Wallmark, Joakim; Ramsay, James O.; Li, Juan; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's attainment of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
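As an illustration of what an IRT model for a polytomously scored item computes, the following is a minimal Python sketch of category probabilities under a generalized partial credit model, a standard parametric form. It is an assumption chosen for illustration and is not the optimal scoring model or necessarily the comparison model used in the study above; the function names, the discrimination `a`, and the threshold values are hypothetical.

```python
import math


def gpcm_probs(theta, a, thresholds):
    """Probability of each score 0..m on one item for a test taker at trait level theta.

    Assumed generalized partial credit form: the logit of score k is the
    running sum of a * (theta - b_v) over the first k thresholds.
    """
    logits = [0.0]  # score 0 has an empty sum
    for b in thresholds:
        logits.append(logits[-1] + a * (theta - b))
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - top) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(theta, a, thresholds):
    """Expected item score: sum of each score weighted by its probability."""
    return sum(k * p for k, p in enumerate(gpcm_probs(theta, a, thresholds)))


if __name__ == "__main__":
    # Hypothetical item with four score categories (0 to 3).
    for theta in (-2.0, 0.0, 2.0):
        probs = gpcm_probs(theta, a=1.2, thresholds=[-1.0, 0.0, 1.0])
        print(theta, [round(p, 3) for p in probs])
```

In this sketch the expected item score rises with the latent trait, which is the relationship between item scores and trait level that the abstract describes.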
Wheeler, Jordan M.; Cohen, Allan S.; Wang, Shiyu – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
Sakworawich, Arnond; Wainer, Howard – Journal of Educational and Behavioral Statistics, 2020
Test scoring models vary in their generality; some even adjust for examinees answering multiple-choice items correctly by accident (guessing), but no models that we are aware of automatically adjust an examinee's score when there is internal evidence of cheating. In this study, we use a combination of jackknife technology with an adaptive robust…
Descriptors: Scoring, Cheating, Test Items, Licensing Examinations (Professions)
Gao, Xuliang; Ma, Wenchao; Wang, Daxun; Cai, Yan; Tu, Dongbo – Journal of Educational and Behavioral Statistics, 2021
This article proposes a class of cognitive diagnosis models (CDMs) for polytomously scored items with different link functions. Many existing polytomous CDMs can be considered as special cases of the proposed class of polytomous CDMs. Simulation studies were carried out to investigate the feasibility of the proposed CDMs and the performance of…
Descriptors: Cognitive Measurement, Models, Test Items, Scoring
Chung, Seungwon; Cai, Li – Journal of Educational and Behavioral Statistics, 2021
In the research reported here, we propose a new method for scale alignment and test scoring in the context of supporting students with disabilities. In educational assessment, students from these special populations take modified tests because of a demonstrated disability that requires more assistance than standard testing accommodation. Updated…
Descriptors: Students with Disabilities, Scoring, Achievement Tests, Test Items
Ramsay, James O.; Wiberg, Marie – Journal of Educational and Behavioral Statistics, 2017
This article promotes the use of modern test theory in testing situations where sum scores for binary responses are now used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and…
Descriptors: Scoring, Test Theory, Computation, Maximum Likelihood Statistics
Jiang, Yu; Zhang, Jiahui; Xin, Tao – Journal of Educational and Behavioral Statistics, 2019
This article is an overview of the National Assessment of Education Quality (NAEQ) of China in reading, mathematics, sciences, arts, physical education, and moral education at Grades 4 and 8. After a review of the background and history of NAEQ, we present the assessment framework with students' holistic development at the core and the design for…
Descriptors: Foreign Countries, Educational Quality, Educational Improvement, National Competency Tests
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package that is available from Stata v.14, 2015. Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Longford, Nicholas T. – Journal of Educational and Behavioral Statistics, 2015
An equating procedure for a testing program with evolving distribution of examinee profiles is developed. No anchor is available because the original scoring scheme was based on expert judgment of the item difficulties. Pairs of examinees from two administrations are formed by matching on coarsened propensity scores derived from a set of…
Descriptors: Equated Scores, Testing Programs, College Entrance Examinations, Scoring
Mariano, Louis T.; Junker, Brian W. – Journal of Educational and Behavioral Statistics, 2007
When constructed response test items are scored by more than one rater, the repeated ratings allow for the consideration of individual rater bias and variability in estimating student proficiency. Several hierarchical models based on item response theory have been introduced to model such effects. In this article, the authors demonstrate how these…
Descriptors: Test Items, Item Response Theory, Rating Scales, Scoring
Segall, Daniel O. – Journal of Educational and Behavioral Statistics, 2004
A new sharing item response theory (SIRT) model is presented that explicitly models the effects of sharing item content between informants and test takers. This model is used to construct adaptive item selection and scoring rules that provide increased precision and reduced score gains in instances where sharing occurs. The adaptive item selection…
Descriptors: Scoring, Item Analysis, Item Response Theory, Adaptive Testing