ERIC - Search Results

Showing 1 to 15 of 21 results

Violation of Conditional Independence in the Many-Facets Rasch Model

Peer reviewed

DeMars, Christine E. – Applied Measurement in Education, 2021

Estimation of parameters for the many-facets Rasch model requires that, conditional on the values of the facets, such as person ability, item difficulty, and rater severity, the observed responses within each facet are independent. This requirement has often been discussed for the Rasch models and 2PL and 3PL models, but it becomes more complex…

Descriptors: Item Response Theory, Test Items, Ability, Scores

Detecting Local Dependence: A Threshold-Autoregressive Item Response Theory (TAR-IRT) Approach for Polytomous Items

Peer reviewed

Tang, Xiaodan; Karabatsos, George; Chen, Haiqin – Applied Measurement in Education, 2020

In applications of item response theory (IRT) models, it is known that empirical violations of the local independence (LI) assumption can significantly bias parameter estimates. To address this issue, we propose a threshold-autoregressive item response theory (TAR-IRT) model that additionally accounts for order dependence among the item responses…

Descriptors: Item Response Theory, Test Items, Models, Computation

Bayesian Estimation and Testing of a Linear Logistic Test Model for Learning during the Test

Peer reviewed

Lozano, José H.; Revuelta, Javier – Applied Measurement in Education, 2021

The present study proposes a Bayesian approach for estimating and testing the operation-specific learning model, a variant of the linear logistic test model that allows for the measurement of the learning that occurs during a test as a result of the repeated use of the operations involved in the items. The advantages of using a Bayesian framework…

Descriptors: Bayesian Statistics, Computation, Learning, Testing

Characterizing the Latent Classes in a Mixture IRT Model Using DIF

Peer reviewed

Karadavut, Tugba – Applied Measurement in Education, 2021

Mixture IRT models address the heterogeneity in a population by extracting latent classes and allowing item parameters to vary between latent classes. Once the latent classes are extracted, they need to be further examined to be characterized. Some approaches have been adopted in the literature for this purpose. These approaches examine either the…

Descriptors: Item Response Theory, Models, Test Items, Maximum Likelihood Statistics

Dissecting Knowledge, Guessing, and Blunder in Multiple Choice Assessments

Peer reviewed

Abu-Ghazalah, Rashid M.; Dubins, David N.; Poon, Gregory M. K. – Applied Measurement in Education, 2023

Multiple choice results are inherently probabilistic outcomes, as correct responses reflect a combination of knowledge and guessing, while incorrect responses additionally reflect blunder, a confidently committed mistake. To objectively resolve knowledge from responses in an MC test structure, we evaluated probabilistic models that explicitly…

Descriptors: Guessing (Tests), Multiple Choice Tests, Probability, Models

The Trade-Off between Model Fit, Invariance, and Validity: The Case of PISA Science Assessments

Peer reviewed

El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020

In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…

Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests

Examining Three Learning Progressions in Middle-School Mathematics for Formative Assessment

Peer reviewed

Pham, Duy N.; Wells, Craig S.; Bauer, Malcolm I.; Wylie, E. Caroline; Monroe, Scott – Applied Measurement in Education, 2021

Assessments built on a theory of learning progressions are promising formative tools to support learning and teaching. The quality and usefulness of those assessments depend, in large part, on the validity of the theory-informed inferences about student learning made from the assessment results. In this study, we introduced an approach to address…

Descriptors: Formative Evaluation, Mathematics Instruction, Mathematics Achievement, Middle School Students

A Comparison of Estimation Techniques for IRT Models with Small Samples

Peer reviewed

Finch, Holmes; French, Brian F. – Applied Measurement in Education, 2019

The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact…

Descriptors: Item Response Theory, Accuracy, Test Items, Difficulty Level
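
For orientation, the standard three-parameter logistic (3PL) item response function named in the abstract above can be sketched as follows. This is a minimal illustration of the textbook formula, not code from the article; the function name `prob_correct_3pl` and the example parameter values are assumptions chosen for illustration.

```python
import math


def prob_correct_3pl(theta, a, b, c):
    """Probability of a correct response under the textbook 3PL model.

    theta: person ability
    a:     item discrimination
    b:     item difficulty
    c:     pseudo-chance (lower asymptote), between 0 and 1
    All names are illustrative, not taken from the article.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic


# When ability equals difficulty, the logistic term is 0.5,
# so the probability is c + (1 - c) / 2.
print(prob_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.25))  # 0.625
```

As ability falls far below the item difficulty, the probability approaches the pseudo-chance value `c` rather than zero, which is what distinguishes the 3PL from the 2PL.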

A New Procedure for Detection of Students' Rapid Guessing Responses Using Response Time

Peer reviewed

Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016

Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…

Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models

Focusing on Interactions between Content and Cognition: A New Perspective on Gender Differences in Mathematical Sub-Competencies

Peer reviewed

George, Ann Cathrice; Robitzsch, Alexander – Applied Measurement in Education, 2018

This article presents a new perspective on measuring gender differences in the large-scale assessment study Trends in International Mathematics and Science Study (TIMSS). The suggested empirical model is directly based on the theoretical competence model of the domain mathematics and thus includes the interaction between content and cognitive sub-competencies.…

Descriptors: Achievement Tests, Elementary Secondary Education, Mathematics Achievement, Mathematics Tests

Using Necessary Information to Identify Item Dependence in Passage-Based Reading Comprehension Tests

Peer reviewed

Baldonado, Angela Argo; Svetina, Dubravka; Gorin, Joanna – Applied Measurement in Education, 2015

Applications of traditional unidimensional item response theory models to passage-based reading comprehension assessment data have been criticized based on potential violations of local independence. However, simple rules for determining dependency, such as including all items associated with a particular passage, may overestimate the dependency…

Descriptors: Reading Tests, Reading Comprehension, Test Items, Item Response Theory

Evaluating the Psychometric Characteristics of Generated Multiple-Choice Test Items

Peer reviewed

Gierl, Mark J.; Lai, Hollis; Pugh, Debra; Touchie, Claire; Boulais, André-Philippe; De Champlain, André – Applied Measurement in Education, 2016

Item development is a time- and resource-intensive process. Automatic item generation integrates cognitive modeling with computer technology to systematically generate test items. To date, however, items generated using cognitive modeling procedures have received limited use in operational testing situations. As a result, the psychometric…

Descriptors: Psychometrics, Multiple Choice Tests, Test Items, Item Analysis

Parameter Recovery and Classification Accuracy under Conditions of Testlet Dependency: A Comparison of the Traditional 2PL, Testlet, and Bi-Factor Models

Peer reviewed

Koziol, Natalie A. – Applied Measurement in Education, 2016

Testlets, or groups of related items, are commonly included in educational assessments due to their many logistical and conceptual advantages. Despite their advantages, testlets introduce complications into the theory and practice of educational measurement. Responses to items within a testlet tend to be correlated even after controlling for…

Descriptors: Classification, Accuracy, Comparative Analysis, Models

Testing Expert-Based versus Student-Based Cognitive Models for a Grade 3 Diagnostic Mathematics Assessment

Peer reviewed

Roduta Roberts, Mary; Alves, Cecilia B.; Chu, Man-Wai; Thompson, Margaret; Bahry, Louise M.; Gotzmann, Andrea – Applied Measurement in Education, 2014

The purpose of this study was to evaluate the adequacy of three cognitive models, one developed by content experts and two generated from student verbal reports for explaining examinee performance on a grade 3 diagnostic mathematics test. For this study, the items were developed to directly measure the attributes in the cognitive model. The…

Descriptors: Foreign Countries, Mathematics Tests, Cognitive Processes, Models

Modeling the Psychometric Properties of Complex Performance Assessment Tasks Using Confirmatory Factor Analysis: A Multistage Model for Calibrating Tasks

Peer reviewed

Kahraman, Nilufer; De Champlain, Andre; Raymond, Mark – Applied Measurement in Education, 2012

Item-level information, such as difficulty and discrimination, is invaluable to test assembly, equating, and scoring practices. Estimating these parameters within the context of large-scale performance assessments is often hindered by the use of unbalanced designs for assigning examinees to tasks and raters because such designs result in very…

Descriptors: Performance Based Assessment, Medicine, Factor Analysis, Test Items
