Publication Date
In 2025: 1
Since 2024: 2
Since 2021 (last 5 years): 5
Since 2016 (last 10 years): 15
Since 2006 (last 20 years): 53
Descriptor
Comparative Analysis: 73
Evaluation Methods: 73
Test Items: 73
Item Response Theory: 26
Simulation: 17
Test Bias: 17
Item Analysis: 16
Foreign Countries: 15
Scores: 14
Test Construction: 14
Achievement Tests: 11
…
Author
Nandakumar, Ratna: 3
Wu, Margaret: 3
Finch, Holmes: 2
He, Yong: 2
Kim, Do-Hong: 2
Stark, Stephen: 2
Wyse, Adam E.: 2
Abedi, Jamal: 1
Akbari, Alireza: 1
Petrosino, Anthony: 1
Arora, Alka, Ed.: 1
…
Education Level
Elementary Secondary Education: 9
Higher Education: 8
Grade 4: 4
Grade 8: 4
Postsecondary Education: 4
Elementary Education: 3
Secondary Education: 3
Early Childhood Education: 2
Grade 5: 2
Grade 6: 2
Intermediate Grades: 2
…
Audience
Practitioners: 1
Researchers: 1
Location
Canada: 3
United States: 2
Asia: 1
Australia: 1
Germany: 1
Hong Kong: 1
India: 1
Ohio: 1
Portugal: 1
Slovakia: 1
Texas: 1
…
Laws, Policies, & Programs
No Child Left Behind Act 2001: 1
Sohee Kim; Ki Lynn Cole – International Journal of Testing, 2025
This study conducted a comprehensive comparison of Item Response Theory (IRT) linking methods applied to a bifactor model, examining their performance on both multiple-choice (MC) and mixed-format tests within the common-item nonequivalent groups (CINEG) design framework. Four distinct multidimensional IRT linking approaches were explored, consisting of…
Descriptors: Item Response Theory, Comparative Analysis, Models, Item Analysis
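As background for this entry, IRT linking places parameter estimates from two separate calibrations onto a common scale. The sketch below shows the simple unidimensional mean/sigma transformation, not the four multidimensional bifactor methods compared in the study; all item parameters are hypothetical.

```python
import numpy as np

# Mean/sigma linking: put new-form estimates on the reference-form scale via
# theta_ref = A * theta_new + B, estimated from common-item difficulties.
def mean_sigma(b_ref, b_new):
    A = np.std(b_ref, ddof=1) / np.std(b_new, ddof=1)
    B = np.mean(b_ref) - A * np.mean(b_new)
    return A, B

b_ref = np.array([-1.2, -0.4, 0.3, 1.1, 1.8])  # hypothetical common-item difficulties
b_new = np.array([-1.0, -0.2, 0.5, 1.3, 2.0])
A, B = mean_sigma(b_ref, b_new)
b_linked = A * b_new + B                        # new-form difficulties, rescaled
```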
Joakim Wallmark; James O. Ramsay; Juan Li; Marie Wiberg – Journal of Educational and Behavioral Statistics, 2024
Item response theory (IRT) models the relationship between the possible scores on a test item and a test taker's level of the latent trait that the item is intended to measure. In this study, we compare two models for tests with polytomously scored items: the optimal scoring (OS) model, a nonparametric IRT model based on the principles of…
Descriptors: Item Response Theory, Test Items, Models, Scoring
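For readers new to polytomous IRT, the usual parametric baseline is Samejima's graded response model. A minimal sketch of its category probabilities with hypothetical parameters (this is not the OS model, whose truncated description cannot be reconstructed here):

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Samejima's graded response model: P(X = k | theta) for ordered categories.
    a is the discrimination; b is an increasing array of K-1 thresholds."""
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - np.asarray(b))))  # P(X >= k)
    cum = np.concatenate(([1.0], p_star, [0.0]))
    return -np.diff(cum)  # K category probabilities, summing to 1

# hypothetical item: discrimination 1.3, thresholds -1.0, 0.2, 1.4
probs = grm_category_probs(theta=0.5, a=1.3, b=[-1.0, 0.2, 1.4])
```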
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them according to which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
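CJ judgment data of this kind are commonly analysed with a Bradley-Terry type model, which recovers a quality scale from the pairwise "better/worse" decisions. A minimal sketch, assuming a wins matrix in which every script wins at least one comparison (not necessarily the estimation used in this study):

```python
import numpy as np

def bradley_terry(wins, n_iter=200):
    """Zermelo/MM fixed-point estimation of Bradley-Terry strengths.
    wins[i, j] = number of times script i was judged better than script j."""
    n = wins.shape[0]
    p = np.ones(n)
    for _ in range(n_iter):
        W = wins.sum(axis=1)                     # total wins per script
        games = wins + wins.T                    # comparisons per pair
        denom = (games / (p[:, None] + p[None, :])).sum(axis=1)
        p = W / denom
        p /= p.sum()                             # fix the scale
    return np.log(p)                             # script quality on a log scale
```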
Walter M. Stroup; Anthony Petrosino; Corey Brady; Karen Duseau – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023
Tests of statistical significance often play a decisive role in establishing the empirical warrant of evidence-based research in education. The results from pattern-based assessment items, as introduced in this paper, are categorical and multimodal and do not immediately support the use of measures of central tendency as typically related to…
Descriptors: Statistical Significance, Comparative Analysis, Research Methodology, Evaluation Methods
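Where item results are categorical rather than numeric, comparisons across conditions can rest on frequency-based tests instead of means. An illustrative chi-square test of homogeneity on hypothetical response-pattern counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts of three response patterns (columns) in two conditions (rows)
table = np.array([[34, 51, 15],
                  [20, 48, 32]])
chi2, p, dof, expected = chi2_contingency(table)
```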
Russell, Michael; Szendey, Olivia; Li, Zhushan – Educational Assessment, 2022
Recent research provides evidence that an intersectional approach to defining reference and focal groups results in a higher percentage of comparisons flagged for potential DIF. The study presented here examined the generalizability of this pattern across methods for examining DIF. While the level of DIF detection differed among the four methods…
Descriptors: Comparative Analysis, Item Analysis, Test Items, Test Construction
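The four DIF methods examined are cut off in the abstract; the Mantel-Haenszel procedure is one widely used option and illustrates the shared logic of conditioning on total score. A sketch with hypothetical counts:

```python
def mantel_haenszel_dif(strata):
    """Mantel-Haenszel common odds ratio over 2x2 tables, one per score stratum.
    Each table is ((ref_correct, ref_incorrect), (focal_correct, focal_incorrect)).
    A ratio near 1 suggests no uniform DIF on the studied item."""
    num = den = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

strata = [((40, 10), (35, 15)),   # hypothetical counts by total-score band
          ((30, 20), (22, 28)),
          ((15, 35), (10, 40))]
alpha_mh = mantel_haenszel_dif(strata)
```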
Yesiltas, Gonca; Paek, Insu – Educational and Psychological Measurement, 2020
A log-linear model (LLM) is a well-known statistical method to examine the relationship among categorical variables. This study investigated the performance of LLM in detecting differential item functioning (DIF) for polytomously scored items via simulations where various sample sizes, ability mean differences (impact), and DIF types were…
Descriptors: Simulation, Sample Size, Item Analysis, Scores
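A log-linear DIF analysis of this kind can be fit as a Poisson GLM on the group x ability-stratum x item-score contingency table; uniform DIF shows up as a group-by-score association after conditioning on the stratum. A minimal sketch with hypothetical counts (not the study's simulation design):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical counts for one polytomous item: group x ability stratum x score
counts = pd.DataFrame({
    "group":   ["ref"] * 6 + ["focal"] * 6,
    "stratum": ["low", "low", "low", "high", "high", "high"] * 2,
    "score":   ["0", "1", "2"] * 4,
    "n": [30, 25, 10, 8, 20, 35, 28, 27, 10, 12, 22, 29],
})
full = smf.glm("n ~ group*stratum + stratum*score + group*score",
               data=counts, family=sm.families.Poisson()).fit()
reduced = smf.glm("n ~ group*stratum + stratum*score",
                  data=counts, family=sm.families.Poisson()).fit()
lr_stat = 2 * (full.llf - reduced.llf)  # LR test of the group:score (DIF) term
```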
Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019
The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…
Descriptors: Test Items, Translation, Computer Software, Evaluators
Malec, Wojciech; Krzeminska-Adamek, Malgorzata – Practical Assessment, Research & Evaluation, 2020
The main objective of the article is to compare several methods of evaluating multiple-choice options through classical item analysis. The methods subjected to examination include the tabulation of choice distribution, the interpretation of trace lines, the point-biserial correlation, the categorical analysis of trace lines, and the investigation…
Descriptors: Comparative Analysis, Evaluation Methods, Multiple Choice Tests, Item Analysis
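The point-biserial correlation mentioned here relates choosing a given option to total score; the key should correlate positively and distractors negatively. A minimal sketch on simulated data (all values hypothetical):

```python
import numpy as np

def point_biserial(option_chosen, total_score):
    """Point-biserial correlation between choosing an option (0/1) and total score."""
    x = np.asarray(option_chosen, dtype=float)
    y = np.asarray(total_score, dtype=float)
    p = x.mean()
    return (y[x == 1].mean() - y[x == 0].mean()) * np.sqrt(p * (1 - p)) / y.std()

rng = np.random.default_rng(0)                  # hypothetical response data
scores = rng.integers(10, 40, size=200)
chose_a = (scores + rng.normal(0, 8, 200) > 28).astype(int)
r_pb = point_biserial(chose_a, scores)
```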
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2017
This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test…
Descriptors: Cutting Scores, Item Response Theory, Bayesian Statistics, Maximum Likelihood Statistics
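The "most commonly used approach" the abstract refers to translates the panel's summed Angoff ratings through the test characteristic curve (TCC). A minimal sketch under a 2PL model with hypothetical parameters and ratings; the ML/EAP/MAP/WML variants differ in how theta is estimated from the ratings:

```python
import numpy as np
from scipy.optimize import brentq

a = np.array([1.1, 0.8, 1.4, 0.9, 1.2])        # hypothetical 2PL discriminations
b = np.array([-0.5, 0.0, 0.3, 0.8, 1.2])       # hypothetical difficulties
ratings = np.array([0.7, 0.6, 0.55, 0.5, 0.4])  # Angoff judgments per item

def tcc(theta):
    """Test characteristic curve: expected raw score at theta under the 2PL."""
    return (1.0 / (1.0 + np.exp(-a * (theta - b)))).sum()

# Find the theta whose expected score equals the summed Angoff ratings:
theta_cut = brentq(lambda t: tcc(t) - ratings.sum(), -4, 4)
```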
Wolkowitz, Amanda A.; Davis-Becker, Susan L.; Gerrow, Jack D. – Journal of Applied Testing Technology, 2016
The purpose of this study was to investigate the impact of a cheating prevention strategy employed for a professional credentialing exam that involved releasing over 7,000 active and retired exam items. This study evaluated: (1) whether any significant differences existed between examinee performance on released versus non-released items; (2) whether item…
Descriptors: Cheating, Test Content, Test Items, Foreign Countries
Moothedath, Shana; Chaporkar, Prasanna; Belur, Madhu N. – Perspectives in Education, 2016
In recent years, the computerised adaptive test (CAT) has gained popularity over conventional exams in evaluating student capabilities with desired accuracy. However, the key limitation of CAT is that it requires a large pool of pre-calibrated questions. In the absence of such a pre-calibrated question bank, offline exams with uncalibrated…
Descriptors: Guessing (Tests), Computer Assisted Testing, Adaptive Testing, Maximum Likelihood Statistics
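As background, the maximum likelihood machinery referenced here can be shown with ability estimation under a 3PL model, whose lower asymptote captures guessing. A sketch with hypothetical item parameters and responses:

```python
import numpy as np
from scipy.optimize import minimize_scalar

a = np.array([1.0, 1.3, 0.8, 1.1])   # hypothetical 3PL discriminations
b = np.array([-0.8, 0.0, 0.5, 1.2])  # difficulties
c = np.array([0.2, 0.25, 0.2, 0.2])  # pseudo-guessing lower asymptotes
x = np.array([1, 1, 0, 1])           # one examinee's observed responses

def neg_log_lik(theta):
    p = c + (1 - c) / (1 + np.exp(-a * (theta - b)))
    return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

theta_hat = minimize_scalar(neg_log_lik, bounds=(-4, 4), method="bounded").x
```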
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Wolf, Raffaela – ProQuest LLC, 2013
Preservation of equity properties was examined using four equating methods (IRT True Score, IRT Observed Score, Frequency Estimation, and Chained Equipercentile) in a mixed-format test under a common-item nonequivalent groups (CINEG) design. Equating of mixed-format tests under a CINEG design can be influenced by factors such as attributes of the…
Descriptors: Testing, Item Response Theory, Equated Scores, Test Items
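Of the four methods, the equipercentile family is easiest to show compactly: each form-X score maps to the form-Y score with the same percentile rank. A bare-bones single-group sketch on simulated scores; the chained method in the study applies this idea through the common-item link:

```python
import numpy as np

def equipercentile(scores_x, scores_y):
    """Map each possible form-X score to the form-Y score at the same percentile
    rank (simplified: no smoothing, no half-point percentile-rank convention)."""
    xs = np.arange(scores_x.min(), scores_x.max() + 1)
    px = np.array([np.mean(scores_x <= s) for s in xs])  # percentile ranks on X
    return xs, np.quantile(np.sort(scores_y), px)        # Y scores at those ranks

rng = np.random.default_rng(1)                           # hypothetical score data
form_x = rng.binomial(40, 0.60, 500)
form_y = rng.binomial(40, 0.55, 500)
x_points, y_equivalents = equipercentile(form_x, form_y)
```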
Hou, Likun; de la Torre, Jimmy; Nandakumar, Ratna – Journal of Educational Measurement, 2014
Analyzing examinees' responses using cognitive diagnostic models (CDMs) has the advantage of providing diagnostic information. To ensure the validity of the results from these models, differential item functioning (DIF) in CDMs needs to be investigated. In this article, the Wald test is proposed to examine DIF in the context of CDMs. This study…
Descriptors: Test Bias, Models, Simulation, Error Patterns
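The Wald test proposed here compares an item's group-specific parameter estimates against their joint sampling variability. A generic sketch (hypothetical estimates and covariance; in the CDM context the parameters would be, e.g., the item's guess and slip):

```python
import numpy as np
from scipy.stats import chi2

def wald_test(beta_ref, beta_focal, cov_diff):
    """Wald statistic for equality of item parameters across groups:
    W = d' V^{-1} d, chi-square with len(d) df under the no-DIF null."""
    d = np.asarray(beta_ref) - np.asarray(beta_focal)
    W = d @ np.linalg.solve(np.asarray(cov_diff), d)
    return W, chi2.sf(W, df=d.size)

# hypothetical guess/slip estimates and covariance of their difference
W, p = wald_test([0.15, 0.10], [0.25, 0.18],
                 [[0.002, 0.0], [0.0, 0.003]])
```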
Liu, Yan; Zumbo, Bruno D.; Gustafson, Paul; Huang, Yi; Kroc, Edward; Wu, Amery D. – Practical Assessment, Research & Evaluation, 2016
A variety of differential item functioning (DIF) methods have been proposed and used to ensure that a test is fair to all test takers in a target population in situations such as when a test is translated into other languages. However, once a method flags an item as showing DIF, it is difficult to conclude that the grouping variable (e.g.,…
Descriptors: Test Items, Test Bias, Probability, Scores