ERIC - Search Results

Showing all 6 results

Linking Errors Introduced by Rapid Guessing Responses When Employing Multigroup Concurrent IRT Scaling

Jiayi Deng – ProQuest LLC, 2024

Test score comparability in international large-scale assessments (LSA) is of utmost importance in measuring the effectiveness of education systems and understanding the impact of education on economic growth. To effectively compare test scores on an international scale, score linking is widely used to convert raw scores from different linguistic…

Descriptors: Item Response Theory, Scoring Rubrics, Scoring, Error of Measurement

Cognitive Diagnosis for Multiple-Choice Responses: Nonparametric Classification Method, Q-Matrix Theory, and Computerized Adaptive Testing

Yu Wang – ProQuest LLC, 2024

The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format…

Descriptors: Multiple Choice Tests, Cognitive Tests, Cognitive Measurement, Educational Diagnosis

Exploring Rating Quality in the Context of High-Stakes Rater-Mediated Educational Assessments

Wenjing Guo – ProQuest LLC, 2021

Constructed response (CR) items are widely used in large-scale testing programs, including the National Assessment of Educational Progress (NAEP) and many district and state-level assessments in the United States. One unique feature of CR items is that they depend on human raters to assess the quality of examinees' work. The judgment of human…

Descriptors: National Competency Tests, Responses, Interrater Reliability, Error of Measurement

Bayesian Approaches to Test Score Measurement Errors in Student Growth Prediction Models

Pei-Hsuan Chiu – ProQuest LLC, 2018

Evidence of student growth is a primary outcome of interest for educational accountability systems. When three or more years of student test data are available, questions around how students grow and what their predicted growth is can be answered. Given that test scores contain measurement error, this error should be considered in growth and…

Descriptors: Bayesian Statistics, Scores, Error of Measurement, Growth Models

From OLS to Multilevel Multidimensional Mixture IRT: A Model Refinement Approach to Investigating Patterns of Relationships in PISA 2012 Data

Gulsah Gurkan – ProQuest LLC, 2021

Secondary analyses of international large-scale assessments (ILSA) commonly characterize relationships between variables of interest using correlations. However, the accuracy of correlation estimates is impaired by artefacts such as measurement error and clustering. Despite advancements in methodology, conventional correlation estimates or…

Descriptors: Secondary School Students, Achievement Tests, International Assessment, Foreign Countries

Effect of Violating Unidimensional Item Response Theory Vertical Scaling Assumptions on Developmental Score Scales

Anna Marie Topczewski – ProQuest LLC, 2013

Developmental score scales represent the performance of students along a continuum, where as students learn more they move higher along that continuum. Unidimensional item response theory (UIRT) vertical scaling has become a commonly used method to create developmental score scales. Research has shown that UIRT vertical scaling methods can be…

Descriptors: Item Response Theory, Scaling, Scores, Student Development