Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 2
Since 2016 (last 10 years): 7
Since 2006 (last 20 years): 8
Descriptor
Comparative Analysis: 11
Models: 11
Item Response Theory: 5
Scores: 4
Accuracy: 3
Educational Assessment: 3
Error of Measurement: 3
Simulation: 3
Achievement Tests: 2
Bayesian Statistics: 2
Classification: 2
Source
Applied Measurement in Education: 11
Author
Finch, Holmes: 2
Allen, Jeff: 1
Crone, Linda J.: 1
Custer, Michael: 1
French, Brian F.: 1
Ing, Marsha: 1
Koziol, Natalie A.: 1
Lee, Won-Chan: 1
Mehrens, William A.: 1
Omar, Md Hafidz: 1
Phillips, S. E.: 1
Publication Type
Journal Articles: 11
Reports - Research: 8
Reports - Evaluative: 3
Information Analyses: 1
Education Level
Middle Schools: 2
Elementary Education: 1
Elementary Secondary Education: 1
Grade 10: 1
Grade 11: 1
Grade 4: 1
Grade 5: 1
Grade 6: 1
Grade 9: 1
High Schools: 1
Intermediate Grades: 1
Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to the identification of differential item functioning (DIF), which occurs when item responses for individuals from two groups differ after conditioning on the latent trait measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
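For reference, the condition the abstract describes can be stated compactly (generic notation, not drawn from the article itself): an item $i$ is free of DIF when, for every value of the latent trait $\theta$,

$$P(Y_i = y \mid \theta, G = 1) = P(Y_i = y \mid \theta, G = 2).$$

DSF applies the same comparison not to the polytomous item as a whole but to each step (adjacent-category threshold), so an item can show DSF in one step while its other steps function equivalently across groups.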
Song, Yoon Ah; Lee, Won-Chan – Applied Measurement in Education, 2022
This article examines the performance of item response theory (IRT) models when double ratings, rather than single ratings, are used as item scores in the presence of rater effects. Study 1 examined the influence of the number of ratings on the accuracy of proficiency estimation in the generalized partial credit model (GPCM). Study 2 compared the accuracy of…
Descriptors: Item Response Theory, Item Analysis, Scores, Accuracy
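As background, the GPCM named in the abstract has the standard form (Muraki, 1992), with discrimination $a_i$ and step difficulties $b_{iv}$:

$$P(Y_{ij} = k \mid \theta_j) = \frac{\exp \sum_{v=0}^{k} a_i(\theta_j - b_{iv})}{\sum_{c=0}^{m_i} \exp \sum_{v=0}^{c} a_i(\theta_j - b_{iv})}, \qquad k = 0, 1, \ldots, m_i,$$

where $b_{i0} \equiv 0$ by convention and $m_i$ is the highest score category of item $i$.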
Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019
The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard three-parameter logistic (3PL) model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…
Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
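Writing out the standard 3PL model makes the parameters listed in the abstract explicit:

$$P(Y_{ij} = 1 \mid \theta_j) = c_i + (1 - c_i)\,\frac{1}{1 + \exp[-a_i(\theta_j - b_i)]},$$

where $b_i$ is item difficulty, $a_i$ is discrimination, $c_i$ is the pseudo-chance (lower-asymptote) parameter, and $\theta_j$ is person ability.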
Yi, Yeon-Sook – Applied Measurement in Education, 2017
This study compares five cognitive diagnostic models in search of the optimal one(s) for English as a Second Language grammar test data. Using a unified modeling framework that can represent specific models with proper constraints, the article first fits the full model (the log-linear cognitive diagnostic model, LCDM) and investigates which model…
Descriptors: English (Second Language), Grammar, Language Tests, Cognitive Measurement
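For context, the LCDM's item response function (Henson, Templin, & Willse, 2009) is a logistic regression on an examinee's binary attribute profile $\boldsymbol{\alpha}_j$:

$$P(X_{ij} = 1 \mid \boldsymbol{\alpha}_j) = \frac{\exp\big(\lambda_{i,0} + \boldsymbol{\lambda}_i^{\top}\,\mathbf{h}(\boldsymbol{\alpha}_j, \mathbf{q}_i)\big)}{1 + \exp\big(\lambda_{i,0} + \boldsymbol{\lambda}_i^{\top}\,\mathbf{h}(\boldsymbol{\alpha}_j, \mathbf{q}_i)\big)},$$

where $\mathbf{h}(\cdot)$ collects main effects and interactions of the attributes required by item $i$ (per its Q-matrix row $\mathbf{q}_i$). Constraining the $\lambda$ terms yields specific reduced models, which is the "proper constraints" approach the abstract describes.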
Ing, Marsha – Applied Measurement in Education, 2016
Drawing inferences about the extent to which student performance reflects instructional opportunities relies on the premise that the measure of student performance is actually sensitive to those opportunities. An instructional sensitivity framework suggests that some assessments are more sensitive than others in detecting differences in instructional…
Descriptors: Mathematics Tests, Mathematics Achievement, Performance, Educational Opportunities
Allen, Jeff – Applied Measurement in Education, 2017
Using a sample of schools testing annually in grades 9-11 with a vertically linked series of assessments, the author models test scores with a latent growth curve model in which student intercepts and slopes are nested within schools. Missed assessments can occur because of student mobility, student dropout, absenteeism, and other reasons. Missing data…
Descriptors: Achievement Gains, Academic Achievement, Growth Models, Scores
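A generic three-level form consistent with the design described (notation mine, not the article's): for student $i$ in school $j$ at occasion $t$,

$$y_{tij} = \pi_{0ij} + \pi_{1ij}\, t + e_{tij}, \qquad \pi_{0ij} = \beta_{00j} + r_{0ij}, \quad \pi_{1ij} = \beta_{10j} + r_{1ij},$$

with the school-level means $\beta_{00j}$ and $\beta_{10j}$ themselves varying randomly across schools, so that student intercepts and slopes are nested within school.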
Koziol, Natalie A. – Applied Measurement in Education, 2016
Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…
Descriptors: Classification, Accuracy, Comparative Analysis, Models
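One common way to formalize the within-testlet dependence the abstract mentions is the testlet response model of Bradlow, Wainer, and Wang (1999), a 2PL with a person-specific testlet effect:

$$\operatorname{logit} P(Y_{ij} = 1) = a_i\big(\theta_j - b_i - \gamma_{j\,d(i)}\big),$$

where $\gamma_{j\,d(i)}$ is person $j$'s random effect for the testlet $d(i)$ containing item $i$; a nonzero testlet variance induces residual correlation among items in the same testlet even after conditioning on $\theta_j$.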
Custer, Michael; Omar, Md Hafidz; Pomplun, Mark – Applied Measurement in Education, 2006
This study compared vertical scaling results for the Rasch model from BILOG-MG and WINSTEPS. The item and ability parameters for simulated vocabulary tests were scaled across 11 grades, kindergarten through grade 10. The simulated data were based on real data and were generated under normal and skewed distribution assumptions. WINSTEPS and BILOG-MG were each…
Descriptors: Models, Scaling, Computer Software, Vocabulary
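For reference, the Rasch model being scaled takes the one-parameter logistic form

$$P(Y_{ij} = 1 \mid \theta_j, b_i) = \frac{\exp(\theta_j - b_i)}{1 + \exp(\theta_j - b_i)},$$

with person ability $\theta_j$ and item difficulty $b_i$. The two programs fit the same model but estimate it differently (WINSTEPS uses joint maximum likelihood, BILOG-MG marginal maximum likelihood), one reason their vertical scaling results can diverge.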

Mehrens, William A.; Phillips, S. E. – Applied Measurement in Education, 1989
A sequential decision-making approach to teacher licensure, based on college grade point averages and test scores within the conjunctive model, is contrasted with the compensatory model of decision making. Criteria for choosing one model over the other and a rationale for favoring the conjunctive model are provided. (TJH)
Descriptors: Comparative Analysis, Cutting Scores, Decision Making, Grade Point Average
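A minimal sketch of the two decision rules being contrasted; all cutoffs, weights, and function names are hypothetical illustrations, not values from the article:

```python
# Conjunctive vs. compensatory licensure decisions; all numbers are
# hypothetical and chosen only to illustrate the two rules.

def conjunctive_pass(gpa: float, test: float,
                     gpa_cut: float = 2.5, test_cut: float = 70.0) -> bool:
    """Conjunctive model: every measure must clear its own cutoff,
    so a high score on one measure cannot offset a low score on another."""
    return gpa >= gpa_cut and test >= test_cut

def compensatory_pass(gpa: float, test: float,
                      w_gpa: float = 10.0, w_test: float = 1.0,
                      composite_cut: float = 100.0) -> bool:
    """Compensatory model: a single weighted composite is compared to
    one cutoff, so strength on one measure can offset weakness on another."""
    return w_gpa * gpa + w_test * test >= composite_cut

# A candidate with a weak GPA but a strong test score passes only
# under the compensatory rule:
print(conjunctive_pass(2.2, 95.0))   # False: 2.2 < 2.5
print(compensatory_pass(2.2, 95.0))  # True: 10*2.2 + 95 = 117 >= 100
```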

Crone, Linda J.; And Others – Applied Measurement in Education, 1995
The feasibility of combining criterion-referenced and norm-referenced tests into a single school achievement test score for use in school effectiveness classification was studied with 361 elementary schools across two years and with a subsample of 264 schools. Results support combining scores from different tests to measure school effectiveness. (SLD)
Descriptors: Achievement Tests, Classification, Comparative Analysis, Criterion Referenced Tests

Plake, Barbara S. – Applied Measurement in Education, 1995
This article provides a framework for the remaining articles in this special issue, which compare the utility of three standard-setting methods for use with complex performance assessments. The context of the standard-setting study is described, and the methods are outlined. (SLD)
Descriptors: Comparative Analysis, Criteria, Decision Making, Educational Assessment