ERIC - Search Results

Showing all 9 results

Semisupervised Learning Method to Adjust Biased Item Difficulty Estimates Caused by Nonignorable Missingness in a Virtual Learning Environment

Peer reviewed

Xue, Kang; Huggins-Manley, Anne Corinne; Leite, Walter – Educational and Psychological Measurement, 2022

In data collected from virtual learning environments (VLEs), item response theory (IRT) models can be used to guide the ongoing measurement of student ability. However, such applications of IRT rely on unbiased item parameter estimates associated with test items in the VLE. Without formal piloting of the items, one can expect a large amount of…

Descriptors: Virtual Classrooms, Artificial Intelligence, Item Response Theory, Item Analysis

How Does Calibration Timing and Seasonality Affect Item Parameter Estimates?

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational and Psychological Measurement, 2016

Continuously administered examination programs, particularly credentialing programs that require graduation from educational programs, often experience seasonality where distributions of examinee ability may differ over time. Such seasonality may affect the quality of important statistical processes, such as item response theory (IRT) item…

Descriptors: Test Items, Item Response Theory, Computation, Licensing Examinations (Professions)

Item Response Theory Models for Wording Effects in Mixed-Format Scales

Peer reviewed

Wang, Wen-Chung; Chen, Hui-Fang; Jin, Kuan-Yu – Educational and Psychological Measurement, 2015

Many scales contain both positively and negatively worded items. Reverse recoding of negatively worded items might not be enough for them to function as positively worded items do. In this study, we commented on the drawbacks of existing approaches to wording effect in mixed-format scales and used bi-factor item response theory (IRT) models to…

Descriptors: Item Response Theory, Test Format, Language Usage, Test Items

The Long-Term Sustainability of Different Item Response Theory Scaling Methods

Peer reviewed

Keller, Lisa A.; Keller, Robert R. – Educational and Psychological Measurement, 2011

This article investigates the accuracy of examinee classification into performance categories and the estimation of the theta parameter for several item response theory (IRT) scaling techniques when applied to six administrations of a test. Previous research has investigated only two administrations; however, many testing programs equate tests…

Descriptors: Item Response Theory, Scaling, Sustainability, Classification

Peer reviewed

Wyse, Adam E. – Educational and Psychological Measurement, 2011

Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method. In the Bookmark method, panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on an RP criterion. This study investigates whether…

Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability

A Comparison of Approaches for Improving the Reliability of Objective Level Scores

Peer reviewed

Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010

This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…

Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

A Generalizability Theory Approach to Standard Error Estimates for Bookmark Standard Settings

Peer reviewed

Lee, Guemin; Lewis, Daniel M. – Educational and Psychological Measurement, 2008

The bookmark standard-setting procedure is an item response theory-based method that is widely implemented in state testing programs. This study estimates standard errors for cut scores resulting from bookmark standard settings under a generalizability theory model and investigates the effects of different universes of generalization and error…

Descriptors: Generalizability Theory, Testing Programs, Error of Measurement, Cutting Scores

Construct Equivalence across Grades in a Vertical Scale for a K-12 Large-Scale Reading Assessment

Peer reviewed

Wang, Shudong; Jiao, Hong – Educational and Psychological Measurement, 2009

In practice, vertical scales have been continually used to measure students' achievement progress across several grade levels and have been considered very challenging psychometric procedures. Recently, such practices have been drawing many criticisms. The major criticisms focus on dimensionality and construct equivalence of the latent trait or…

Descriptors: Reading Comprehension, Elementary Secondary Education, Measures (Individuals), Psychometrics

Automated Simultaneous Assembly of Multistage Testlets for a High-Stakes Licensing Examination

Peer reviewed

Breithaupt, Krista; Hare, Donovan R. – Educational and Psychological Measurement, 2007

Many challenges exist for high-stakes testing programs offering continuous computerized administration. The automated assembly of test questions to exactly meet content and other requirements, provide uniformity, and control item exposure can be modeled and solved by mixed-integer programming (MIP) methods. A case study of the computerized…

Descriptors: Testing Programs, Psychometrics, Certification, Accounting
