Publication Date
  In 2025: 0
  Since 2024: 2
  Since 2021 (last 5 years): 5
  Since 2016 (last 10 years): 11
  Since 2006 (last 20 years): 24
Descriptor
  Scoring: 37
  Simulation: 37
  Test Items: 37
  Item Response Theory: 20
  Computer Assisted Testing: 14
  Adaptive Testing: 12
  Models: 10
  Comparative Analysis: 9
  Item Analysis: 9
  Test Construction: 8
  Statistical Analysis: 7
Author
  Lee, Won-Chan: 2
  Segall, Daniel O.: 2
  Stocking, Martha L.: 2
  Al-Mashary, Faisal: 1
  Ali, Usama S.: 1
  Becker, Benjamin: 1
  Breyer, F. Jay: 1
  Brull, Harry: 1
  Chang, Hua-Hua: 1
  Chen, Ping: 1
  Cohen, Allan S.: 1
Publication Type
  Journal Articles: 22
  Reports - Research: 21
  Reports - Evaluative: 8
  Speeches/Meeting Papers: 5
  Dissertations/Theses -…: 4
  Reports - Descriptive: 3
  Books: 1
  Collected Works - General: 1
  Tests/Questionnaires: 1
Assessments and Surveys
  Law School Admission Test: 1
  National Assessment of…: 1
  Program for International Student Assessment: 1
Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's level of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
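For context, a standard parametric baseline for polytomously scored items is Samejima's graded response model; it is shown here purely as an illustration of what a parametric IRT model for ordered scores looks like, not as the model used in the study:

\[
P(X_j \ge k \mid \theta) = \frac{1}{1 + e^{-a_j(\theta - b_{jk})}},
\qquad
P(X_j = k \mid \theta) = P(X_j \ge k \mid \theta) - P(X_j \ge k+1 \mid \theta),
\]

where $a_j$ is the item's discrimination and $b_{j1} < b_{j2} < \dots$ are ordered category thresholds.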
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
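To make the idea concrete, here is a minimal sketch of one ingredient of online calibration under a 2PL model: treating examinees' provisional ability estimates from the operational CAT as fixed and estimating a single pretest item's parameters by maximum likelihood. The data, function names, and optimizer are illustrative assumptions, not the article's procedure.

    import numpy as np
    from scipy.optimize import minimize

    def neg_log_lik(params, thetas, responses):
        # 2PL negative log-likelihood for one pretest item,
        # with examinee abilities (thetas) treated as known.
        a, b = params
        p = 1.0 / (1.0 + np.exp(-a * (thetas - b)))
        p = np.clip(p, 1e-9, 1 - 1e-9)  # numerical safety
        return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

    rng = np.random.default_rng(0)
    thetas = rng.normal(size=500)    # provisional CAT ability estimates
    true_a, true_b = 1.2, 0.3        # generating parameters (simulation only)
    responses = rng.binomial(1, 1 / (1 + np.exp(-true_a * (thetas - true_b))))

    fit = minimize(neg_log_lik, x0=[1.0, 0.0], args=(thetas, responses),
                   bounds=[(0.2, 3.0), (-4.0, 4.0)])
    print(fit.x)  # estimated (a, b) for the pretest item

Real online-calibration designs add rules for when to seed pretest items and iterate between ability and item estimation; this sketch shows only the item-side likelihood step.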
Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024
Topic models are mathematical and statistical models used to analyze textual data. Their objective is to gain information about the latent semantic space of a set of related textual data. The semantic space captures the relationships between documents and words and how those words are used. Topic models are becoming…
Descriptors: Semantics, Educational Assessment, Evaluators, Reliability
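Latent Dirichlet allocation (LDA) is one widely used topic model; a minimal scikit-learn sketch follows, with a toy corpus and settings that are illustrative only and unrelated to the study's data.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "the scoring rubric rewards clear reasoning",
        "raters disagreed on the essay scores",
        "reasoning and evidence drive the rubric",
    ]

    # Document-term matrix: rows are documents, columns are word counts
    dtm = CountVectorizer().fit_transform(docs)

    # Fit a two-topic LDA model
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(dtm)  # document-by-topic proportions
    word_topics = lda.components_        # topic-by-word weights

    print(doc_topics.round(2))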
Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023
This study evaluates various scoring methods, including number-correct scoring, IRT theta scoring, and hybrid scoring, in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in preserving the first two moments of scale scores for a population in a chain of…
Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation
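The contrast between the first two methods is easy to state in code. Below is a minimal sketch, assuming a 2PL model with known item parameters and a simple grid search for the maximum-likelihood theta; all values are illustrative.

    import numpy as np

    a = np.array([1.0, 1.5, 0.8, 1.2])   # item discriminations (assumed known)
    b = np.array([-1.0, 0.0, 0.5, 1.0])  # item difficulties (assumed known)
    u = np.array([1, 1, 0, 1])           # one examinee's 0/1 responses

    # Number-correct score: the count of correct responses
    nc_score = u.sum()

    # IRT theta score: maximum-likelihood estimate over a grid
    grid = np.linspace(-4, 4, 801)
    P = 1 / (1 + np.exp(-a * (grid[:, None] - b)))   # grid x items
    loglik = (u * np.log(P) + (1 - u) * np.log(1 - P)).sum(axis=1)
    theta_hat = grid[np.argmax(loglik)]

    print(nc_score, theta_hat)

Hybrid scoring combines elements of both; the abstract does not specify its form, so it is not sketched here.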
Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020
With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…
Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics
Han, Kyung T.; Dimitrov, Dimiter M.; Al-Mashary, Faisal – Educational and Psychological Measurement, 2019
The "D"-scoring method for scoring and equating tests with binary items proposed by Dimitrov offers some of the advantages of item response theory, such as item-level difficulty information and score computation that reflects the item difficulties, while retaining the merits of classical test theory such as the simplicity of number…
Descriptors: Test Construction, Scoring, Test Items, Adaptive Testing
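To convey the flavor of difficulty-weighted scoring, one illustrative form (an assumption for exposition; Dimitrov's papers give the exact "D"-score definition) weights each binary response $u_j$ by the item's difficulty $d_j = 1 - p_j$, where $p_j$ is the proportion of examinees answering item $j$ correctly:

\[
D = \frac{\sum_j u_j\, d_j}{\sum_j d_j},
\]

so that correct answers to harder items raise the score more than correct answers to easier ones.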
Svetina, Dubravka; Liaw, Yuan-Ling; Rutkowski, Leslie; Rutkowski, David – Journal of Educational Measurement, 2019
This study investigates the effect of several design and administration choices on item exposure and person/item parameter recovery under a multistage test (MST) design. In a simulation study, we examine whether number-correct (NC) or item response theory (IRT) methods are differentially effective at routing students to the correct next stage(s)…
Descriptors: Measurement, Item Analysis, Test Construction, Item Response Theory
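As a minimal sketch of the routing step being compared, number-correct (NC) routing in a two-stage MST can be as simple as the following; the cut score and module names are arbitrary illustrations.

    def route_next_module(responses, cut=4):
        # Route an examinee after the routing stage of a two-stage MST.
        # `responses` are scored 0/1; `cut` is an illustrative NC cut score.
        # IRT-based routing would instead compare a provisional theta
        # estimate against a cut point on the theta scale.
        nc = sum(responses)
        return "hard_module" if nc >= cut else "easy_module"

    print(route_next_module([1, 1, 0, 1, 1, 0]))  # -> hard_module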
Item Order and Speededness: Implications for Test Fairness in Higher Educational High-Stakes Testing
Becker, Benjamin; van Rijn, Peter; Molenaar, Dylan; Debeer, Dries – Assessment & Evaluation in Higher Education, 2022
A common approach to increase test security in higher educational high-stakes testing is the use of different test forms with identical items but different item orders. The effects of such varied item orders are relatively well studied, but findings have generally been mixed. When multiple test forms with different item orders are used, we argue…
Descriptors: Information Security, High Stakes Tests, Computer Security, Test Items
Wang, Keyin – ProQuest LLC, 2017
The comparison of item-level computerized adaptive testing (CAT) and multistage adaptive testing (MST) has been researched extensively (e.g., Kim & Plake, 1993; Luecht et al., 1996; Patsula, 1999; Jodoin, 2003; Hambleton & Xing, 2006; Keng, 2008; Zheng, 2012). Various CAT and MST designs have been investigated and compared under the same…
Descriptors: Comparative Analysis, Computer Assisted Testing, Adaptive Testing, Test Items
Yang, Ji Seung; Zheng, Xiaying – Journal of Educational and Behavioral Statistics, 2018
The purpose of this article is to introduce and review the capability and performance of the Stata item response theory (IRT) package available since Stata v.14 (2015). Using a simulated data set and a publicly available item response data set extracted from the Programme for International Student Assessment, we review the IRT package from…
Descriptors: Item Response Theory, Item Analysis, Computer Software, Statistical Analysis
Steiner, Peter M.; Kim, Yongnam – Society for Research on Educational Effectiveness, 2014
In contrast to randomized experiments, the estimation of unbiased treatment effects from observational data requires an analysis that conditions on all confounding covariates. Conditioning on covariates can be done via standard parametric regression techniques or nonparametric matching like propensity score (PS) matching. The regression or…
Descriptors: Observation, Research Methodology, Test Bias, Regression (Statistics)
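A minimal sketch of the propensity score (PS) matching idea, with simulated data and one-to-one nearest-neighbor matching on the estimated PS (all design choices here are illustrative, not the authors'):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))                    # confounding covariates
    t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))  # treatment depends on X
    y = 2 * t + X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=200)

    # Step 1: estimate propensity scores P(T = 1 | X)
    ps = LogisticRegression().fit(X, t).predict_proba(X)[:, 1]

    # Step 2: match each treated unit to the control unit nearest in PS
    treated = np.where(t == 1)[0]
    control = np.where(t == 0)[0]
    nearest = np.abs(ps[control][None, :] - ps[treated][:, None]).argmin(axis=1)
    matches = control[nearest]

    # Step 3: estimate the average treatment effect on the treated (ATT)
    att = (y[treated] - y[matches]).mean()
    print(round(att, 2))  # should land near the generating effect of 2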
Shin, Hyo Jeong – ProQuest LLC, 2015
This dissertation comprises three papers that propose and apply psychometric models to deal with complexities and challenges in large-scale assessments, focusing on modeling rater effects and complex learning progressions. In particular, the three papers investigate extensions and applications of multilevel and multidimensional item response…
Descriptors: Item Response Theory, Psychometrics, Models, Measurement
Ali, Usama S.; Chang, Hua-Hua – ETS Research Report Series, 2014
Adaptive testing is advantageous in that it provides more efficient ability estimates with fewer items than linear testing does. Item-driven adaptive pretesting may offer similar advantages, and verifying this hypothesis about item calibration was the main objective of this study. A suitability index (SI) was introduced to adaptively…
Descriptors: Adaptive Testing, Simulation, Pretests Posttests, Test Items
Suh, Youngsuk; Cho, Sun-Joo; Wollack, James A. – Journal of Educational Measurement, 2012
In the presence of test speededness, the parameters of item response theory models can be poorly estimated due to conditional dependencies among items, particularly for end-of-test items (i.e., speeded items). This article conducted a systematic comparison of five item calibration procedures--a two-parameter logistic (2PL) model, a…
Descriptors: Response Style (Tests), Timed Tests, Test Items, Item Response Theory
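For reference, the 2PL model named above gives the probability of a correct response to item $j$ as

\[
P(u_{ij} = 1 \mid \theta_i) = \frac{1}{1 + e^{-a_j(\theta_i - b_j)}},
\]

where $\theta_i$ is examinee ability, $a_j$ the item discrimination, and $b_j$ the item difficulty. Speededness violates the model's assumption that responses are conditionally independent given $\theta_i$, which is why end-of-test items calibrate poorly.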
He, Wei; Wolfe, Edward W. – Educational and Psychological Measurement, 2012
When intelligence tests are individually administered, items are commonly presented in a sequence of increasing difficulty, and test administration is terminated after a predetermined number of incorrect answers. This practice produces stochastically censored data, a form of nonignorable missing data. By manipulating four factors…
Descriptors: Individual Testing, Intelligence Tests, Test Items, Test Length
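A minimal sketch of how such a discontinue rule censors the data: whether later items are observed depends on the responses themselves, which is what makes the missingness nonignorable. The rule and values below are illustrative.

    import numpy as np

    def administer(responses, stop_after=3):
        # Apply a discontinue rule: stop after `stop_after` consecutive
        # incorrect answers; items never administered become missing.
        observed, wrong_streak = [], 0
        for r in responses:
            observed.append(r)
            wrong_streak = wrong_streak + 1 if r == 0 else 0
            if wrong_streak == stop_after:
                break
        # np.nan marks censored (never-administered) items
        return observed + [np.nan] * (len(responses) - len(observed))

    print(administer([1, 1, 0, 0, 1, 0, 0, 0, 1, 1]))
    # -> [1, 1, 0, 0, 1, 0, 0, 0, nan, nan]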