Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
Wan, Siyu; Keller, Lisa A. – Practical Assessment, Research & Evaluation, 2023
Statistical process control (SPC) charts have been widely used in the field of educational measurement. The cumulative sum (CUSUM) is an established SPC method to detect aberrant responses for educational assessments. There are many studies that investigated the performance of CUSUM in different test settings. This paper describes the CUSUM…
Descriptors: Visual Aids, Educational Assessment, Evaluation Methods, Item Response Theory
Lozano, José H.; Revuelta, Javier – Educational and Psychological Measurement, 2023
The present paper introduces a general multidimensional model to measure individual differences in learning within a single administration of a test. Learning is assumed to result from practicing the operations involved in solving the items. The model accounts for the possibility that the ability to learn may manifest differently for correct and…
Descriptors: Bayesian Statistics, Learning Processes, Test Items, Item Analysis
Todd, Amber; Romine, William L.; Cook Whitt, Katahdin – Science Education, 2017
We describe the development, validation, and use of the "Learning Progression-Based Assessment of Modern Genetics" (LPA-MG) in a high school biology context. Items were constructed based on a current learning progression framework for genetics (Shea & Duncan, 2013; Todd & Kenyon, 2015). The 34-item instrument, which was tied to…
Descriptors: Genetics, Science Instruction, High School Students, Evaluation Methods
Debelak, Rudolf; Arendasy, Martin – Educational and Psychological Measurement, 2012
A new approach to identify item clusters fitting the Rasch model is described and evaluated using simulated and real data. The proposed method is based on hierarchical cluster analysis and constructs clusters of items that show a good fit to the Rasch model. It thus gives an estimate of the number of independent scales satisfying the postulates of…
Descriptors: Test Items, Factor Analysis, Evaluation Methods, Simulation
Maydeu-Olivares, Alberto – Measurement: Interdisciplinary Research and Perspectives, 2013
In this rejoinder, Maydeu-Olivares states that, in item response theory (IRT) measurement applications, the application of goodness-of-fit (GOF) methods informs researchers of the discrepancy between the model and the data being fitted (the room for improvement). By routinely reporting the GOF of IRT models, together with the substantive results…
Descriptors: Goodness of Fit, Models, Evaluation Methods, Item Response Theory
Perry, John L.; Clough, Peter J.; Crust, Lee; Nabb, Sam L.; Nicholls, Adam R. – Research Quarterly for Exercise and Sport, 2015
Purpose: A new measure of sportspersonship, which differentiates between compliance and principled approaches, was developed and initially validated in 3 studies. Method: Study 1 developed items, assessed content validity, and proposed a model. Study 2 tested the factorial validity of the model on an independent sample. Study 3 further tested the…
Descriptors: Program Development, Program Validation, Physical Education, Compliance (Legal)
Dowdy, Erin; Furlong, Michael J.; Sharkey, Jill D. – Journal of Emotional and Behavioral Disorders, 2013
This study examined the potential utility of adding items that assessed youths' emotional and behavioral disorders to a commonly used surveillance survey. The goal was to evaluate whether the added items could enhance understanding of youths' involvement in high-risk behaviors. A sample of 3,331 adolescents in Grades 8, 10, and 12 from four…
Descriptors: Behavior Disorders, Adolescents, Addictive Behavior, Surveys
Johnson, Philip; Tymms, Peter – Journal of Research in Science Teaching, 2011
Previously, a small scale, interview-based, 3-year longitudinal study (ages 11-14) in one school had suggested a learning progression related to the concept of a substance. This article presents the results of a large-scale, cross-sectional study which used Rasch modeling to test the hypothesis of the learning progression. Data were collected from…
Descriptors: Computer Assisted Testing, Chemistry, Measures (Individuals), Foreign Countries
Davison, Mark L.; Kim, Se-Kang; Close, Catherine – Multivariate Behavioral Research, 2009
A profile is a vector of scores for one examinee. The mean score in the vector can be interpreted as a measure of overall profile height, the variance can be interpreted as a measure of within person variation, and the ipsatized vector of score deviations about the mean can be said to describe the pattern in the score profile. A within person…
Descriptors: Vocational Interests, Interest Inventories, Profiles, Scores
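The decomposition described in the abstract above (profile height, within-person variation, and the ipsatized pattern) can be illustrated with a minimal sketch. The function name `decompose_profile` and the example scores are hypothetical, and the use of the population variance (dividing by the number of scores) is an assumption, since the abstract does not say which variance estimator the authors use.

```python
from statistics import mean, pvariance


def decompose_profile(scores):
    """Split one examinee's score vector into level, variation, and pattern.

    Hypothetical helper for illustration only; not taken from the article.
    """
    # Profile height: the mean of the scores in the vector.
    level = mean(scores)
    # Within-person variation: population variance (assumed estimator).
    variation = pvariance(scores, mu=level)
    # Pattern: ipsatized scores, i.e. deviations about the person's own mean.
    pattern = [s - level for s in scores]
    return level, variation, pattern


# Made-up scores for a single examinee on four scales.
level, variation, pattern = decompose_profile([12.0, 15.0, 9.0, 14.0])
print(level, variation, pattern)
```

By construction the ipsatized deviations sum to zero, so the pattern carries no information about overall height.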
Roberts, James S. – Applied Psychological Measurement, 2008
Orlando and Thissen (2000) developed an item fit statistic for binary item response theory (IRT) models known as S-X². This article generalizes their statistic to polytomous unfolding models. Four alternative formulations of S-X² are developed for the generalized graded unfolding model (GGUM). The GGUM is a…
Descriptors: Item Response Theory, Goodness of Fit, Test Items, Models
Zhang, Bo; Walker, Cindy M. – Applied Psychological Measurement, 2008
The purpose of this research was to examine the effects of missing data on person-model fit and person trait estimation in tests with dichotomous items. Under the missing-completely-at-random framework, four missing data treatment techniques were investigated including pairwise deletion, coding missing responses as incorrect, hotdeck imputation,…
Descriptors: Item Response Theory, Computation, Goodness of Fit, Test Items
Shujuan, Wang; Meihua, Qian; Jianxin, Zhang – Journal of Psychoeducational Assessment, 2009
This article examines the psychometric structure of the Anxiety Control Questionnaire (ACQ) in Chinese adolescents. With the data collected from 212 senior high school students (94 females, 110 males, 8 unknown), seven models are tested using confirmatory factor analyses in the framework of the multitrait-multimethod strategy. Results indicate…
Descriptors: Multitrait Multimethod Techniques, Factor Structure, Adolescents, Measures (Individuals)
Wells, Craig S.; Bolt, Daniel M. – Applied Measurement in Education, 2008
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Descriptors: Test Length, Test Items, Monte Carlo Methods, Nonparametric Statistics
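For readers unfamiliar with the two-parameter logistic model named in the abstract above, a minimal sketch of its standard item response function follows. The function name `two_pl_probability` and the parameter values are illustrative assumptions. The sketch shows only the model whose fit is being tested, not the nonparametric misfit procedure of Douglas and Cohen (2001).

```python
import math


def two_pl_probability(theta, a, b):
    """Probability of a correct response under the 2PL model.

    theta: examinee's latent trait level
    a: item discrimination
    b: item difficulty
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Illustrative (made-up) item: discrimination 1.2, difficulty 0.0.
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(two_pl_probability(theta, a=1.2, b=0.0), 3))
```

When the trait level equals the item difficulty, the probability is exactly 0.5, and it increases with the trait level for any positive discrimination.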
Cui, Ying; Leighton, Jacqueline P. – Journal of Educational Measurement, 2009
In this article, we introduce a person-fit statistic called the hierarchy consistency index (HCI) to help detect misfitting item response vectors for tests developed and analyzed based on a cognitive model. The HCI ranges from -1.0 to 1.0, with values close to -1.0 indicating that students respond unexpectedly or differently from the responses…
Descriptors: Test Length, Simulation, Correlation, Research Methodology