ERIC - Search Results

Publication Date

In 2025	0
Since 2024	10
Since 2021 (last 5 years)	30
Since 2016 (last 10 years)	65
Since 2006 (last 20 years)	133

Descriptor

Test Items	231
Item Response Theory	86
Difficulty Level	45
Test Construction	45
Multiple Choice Tests	39
Mathematics Tests	38
Scores	36
Test Bias	34
Comparative Analysis	31
Test Format	31
Simulation	28
Achievement Tests	25
Computer Assisted Testing	24
Foreign Countries	24
Item Analysis	24
Statistical Analysis	23
Psychometrics	22
Computation	21
Models	21
Responses	21
Classification	19
Error of Measurement	19
Evaluation Methods	19
Scoring	19
Higher Education	18

Source

Applied Measurement in…	231

Publication Type

Journal Articles	231
Reports - Research	162
Reports - Evaluative	63
Information Analyses	8
Speeches/Meeting Papers	8
Reports - Descriptive	3
Opinion Papers	1
Tests/Questionnaires	1

Education Level

Secondary Education	23
Elementary Secondary Education	18
Elementary Education	17
Middle Schools	15
Grade 5	12
Higher Education	12
Junior High Schools	12
Grade 8	11
High Schools	11
Postsecondary Education	11
Grade 3	7
Grade 4	6
Grade 7	6
Intermediate Grades	6
Grade 6	4
Primary Education	4
Early Childhood Education	3
Grade 11	3
Grade 10	2
Grade 9	2
Grade 1	1
Grade 12	1
Grade 2	1

Audience

Practitioners

Location

Canada	6
Arizona	3
California	3
Germany	3
Massachusetts	3
United Kingdom	3
Indiana	2
Israel	2
New York	2
Ohio	2
Australia	1
Belgium	1
Finland	1
Florida	1
France	1
Georgia	1
Hawaii	1
Idaho	1
Iowa	1
Iran	1
Italy	1
Japan	1
Jordan	1
Kansas	1
Michigan	1

Laws, Policies, & Programs

No Child Left Behind Act 2001

Assessments and Surveys

Program for International…	6
National Assessment of…	5
Trends in International…	5
Graduate Record Examinations	4
SAT (College Admission Test)	3
Iowa Tests of Basic Skills	2
ACT Assessment	1
Advanced Placement…	1
Iowa Tests of Educational…	1
Law School Admission Test	1
Major Field Achievement Test…	1
Measures of Academic Progress	1
Praxis Series	1
Preliminary Scholastic…	1
Progress in International…	1
TerraNova Multiple Assessments	1

Showing 1 to 15 of 231 results

Item-Writing Guidelines on Response Option Placement: A Systematic Review

Peer reviewed

Séverin Lions; María Paz Blanco; Pablo Dartnell; Carlos Monsalve; Gabriel Ortega; Julie Lemarié – Applied Measurement in Education, 2024

Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…

Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items

Using Content Relevance and Representativeness Indices in Instrument Revision

Peer reviewed

Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024

Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…

Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction

The Impact of Non-Effortful Responding on Item and Person Parameters in Item-Pool Scaling Linking

Peer reviewed

Yue Liu; Zhen Li; Hongyun Liu; Xiaofeng You – Applied Measurement in Education, 2024

Low test-taking effort of examinees has been considered a source of construct-irrelevant variance in item response modeling, leading to serious consequences on parameter estimation. This study aims to investigate how non-effortful response (NER) influences the estimation of item and person parameters in item-pool scale linking (IPSL) and whether…

Descriptors: Item Response Theory, Computation, Simulation, Responses

Modeling Dimensions Converging at the Upper Anchor in Learning Progressions: An Example of Micro-Evolution

Peer reviewed

Mingfeng Xue; Mark Wilson – Applied Measurement in Education, 2024

Multidimensionality is common in psychological and educational measurements. This study focuses on dimensions that converge at the upper anchor (i.e. the highest acquisition status defined in a learning progression) and compares different ways of dealing with them using the multidimensional random coefficients multinomial logit model and scale…

Descriptors: Learning Trajectories, Educational Assessment, Item Response Theory, Evolution

An Examination of Individual Ability Estimation and Classification Accuracy under Rapid Guessing Misidentifications

Peer reviewed

Rios, Joseph – Applied Measurement in Education, 2022

To mitigate the deleterious effects of rapid guessing (RG) on ability estimates, several rescoring procedures have been proposed. Underlying many of these procedures is the assumption that RG is accurately identified. At present, there have been minimal investigations examining the utility of rescoring approaches when RG is misclassified, and…

Descriptors: Accuracy, Guessing (Tests), Scoring, Classification

Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing

Peer reviewed

TsungHan Ho – Applied Measurement in Education, 2023

An operational multistage adaptive test (MST) requires the development of a large item bank and the effort to continuously replenish the item bank due to concerns about test security and validity over the long term. New items should be pretested and linked to the item bank before being used operationally. The linking item volume fluctuations in…

Descriptors: Bayesian Statistics, Regression (Statistics), Test Items, Pretesting

Multi-Group Generalizations of SIBTEST and Crossing-SIBTEST

Peer reviewed

Chalmers, R. Philip; Zheng, Guoguo – Applied Measurement in Education, 2023

This article presents generalizations of SIBTEST and crossing-SIBTEST statistics for differential item functioning (DIF) investigations involving more than two groups. After reviewing the original two-group setup for these statistics, a set of multigroup generalizations that support contrast matrices for joint tests of DIF are presented. To…

Descriptors: Test Bias, Test Items, Item Response Theory, Error of Measurement

Detecting Item Parameter Drift in Small Sample Rasch Equating

Peer reviewed

Daniel Jurich; Chunyan Liu – Applied Measurement in Education, 2023

Screening items for parameter drift helps protect against serious validity threats and ensure score comparability when equating forms. Although many high-stakes credentialing examinations operate with small sample sizes, few studies have investigated methods to detect drift in small sample equating. This study demonstrates that several newly…

Descriptors: High Stakes Tests, Sample Size, Item Response Theory, Equated Scores

Identifying Careless Responses in Computer-Adaptive Affective Surveys Using Person Fit Analysis

Peer reviewed

Stefanie A. Wind; Beyza Aksu-Dunya – Applied Measurement in Education, 2024

Careless responding is a pervasive concern in research using affective surveys. Although researchers have considered various methods for identifying careless responses, studies are limited that consider the utility of these methods in the context of computer adaptive testing (CAT) for affective scales. Using a simulation study informed by recent…

Descriptors: Response Style (Tests), Computer Assisted Testing, Adaptive Testing, Affective Measures

IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests

Peer reviewed

Shaojie Wang; Won-Chan Lee; Minqiang Zhang; Lixin Yuan – Applied Measurement in Education, 2024

To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information…

Descriptors: Item Response Theory, Test Format, Test Length, Error of Measurement

Violation of Conditional Independence in the Many-Facets Rasch Model

Peer reviewed

DeMars, Christine E. – Applied Measurement in Education, 2021

Estimation of parameters for the many-facets Rasch model requires that conditional on the values of the facets, such as person ability, item difficulty, and rater severity, the observed responses within each facet are independent. This requirement has often been discussed for the Rasch models and 2PL and 3PL models, but it becomes more complex…

Descriptors: Item Response Theory, Test Items, Ability, Scores

Comparing Drift Detection Methods for Accurate Rasch Equating in Different Sample Sizes

Peer reviewed

Alahmadi, Sarah; Jones, Andrew T.; Barry, Carol L.; Ibáñez, Beatriz – Applied Measurement in Education, 2023

Rasch common-item equating is often used in high-stakes testing to maintain equivalent passing standards across test administrations. If unaddressed, item parameter drift poses a major threat to the accuracy of Rasch common-item equating. We compared the performance of well-established and newly developed drift detection methods in small and large…

Descriptors: Equated Scores, Item Response Theory, Sample Size, Test Items

Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data

Peer reviewed

Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024

Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…

Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests

The Impact of Test-Taking Disengagement on Item Content Representation

Peer reviewed

Wise, Steven L. – Applied Measurement in Education, 2020

In achievement testing there is typically a practical requirement that the set of items administered should be representative of some target content domain. This is accomplished by establishing test blueprints specifying the content constraints to be followed when selecting the items for a test. Sometimes, however, students give disengaged…

Descriptors: Test Items, Test Content, Achievement Tests, Guessing (Tests)

Maintaining Score Scales over Time: A Comparison of Five Scoring Methods

Peer reviewed

Kim, Stella Yun; Lee, Won-Chan – Applied Measurement in Education, 2023

This study evaluates various scoring methods including number-correct scoring, IRT theta scoring, and hybrid scoring in terms of scale-score stability over time. A simulation study was conducted to examine the relative performance of five scoring methods in terms of preserving the first two moments of scale scores for a population in a chain of…

Descriptors: Scoring, Comparative Analysis, Item Response Theory, Simulation

Author

Sireci, Stephen G.	6
Wise, Steven L.	6
Downing, Steven M.	5
Haladyna, Thomas M.	5
Penfield, Randall D.	5
Wells, Craig S.	5
DeMars, Christine E.	4
Ercikan, Kadriye	4
Gierl, Mark J.	4
Hambleton, Ronald K.	4
Lee, Won-Chan	4
Bennett, Randy Elliot	3
Bolt, Daniel M.	3
D'Agostino, Jerome V.	3
Engelhard, George, Jr.	3
Frary, Robert B.	3
Lane, Suzanne	3
Meijer, Rob R.	3
Rodriguez, Michael C.	3
Banks, Kathleen	2
Carney, Michele	2
Chang, Lei	2
Cohen, Allan S.	2
Cohen, Dale J.	2