Publication Date
In 2025: 2
Since 2024: 6
Since 2021 (last 5 years): 41
Since 2016 (last 10 years): 294
Since 2006 (last 20 years): 608
Descriptor
Statistical Analysis: 847
Test Items: 847
Item Response Theory: 225
Foreign Countries: 218
Item Analysis: 197
Difficulty Level: 177
Test Construction: 176
Comparative Analysis: 167
Scores: 153
Test Bias: 141
Correlation: 127
Author
Sinharay, Sandip: 14
Dorans, Neil J.: 8
von Davier, Alina A.: 7
Guo, Hongwen: 6
Holland, Paul W.: 6
Raykov, Tenko: 6
Chang, Hua-Hua: 5
Kim, Sooyeon: 5
Liu, Jinghua: 5
Livingston, Samuel A.: 5
Magis, David: 5
Audience
Researchers: 30
Practitioners: 3
Teachers: 3
Policymakers: 1
Location
Turkey: 31
Germany: 15
Australia: 13
Canada: 11
Netherlands: 11
Japan: 9
Taiwan: 8
United States: 8
Israel: 7
Sweden: 7
California: 6
What Works Clearinghouse Rating
Meets WWC Standards without Reservations: 1
Meets WWC Standards with or without Reservations: 1
Tom Benton – Practical Assessment, Research & Evaluation, 2025
This paper proposes an extension of linear equating that may be useful in one of two fairly common assessment scenarios. One is where different students have taken different combinations of test forms. This might occur, for example, where students have some free choice over the exam papers they take within a particular qualification. In this…
Descriptors: Equated Scores, Test Format, Test Items, Computation
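The extension itself is not reproduced in the abstract; as general background, classical linear equating places a form-X score x on the form-Y scale using only the two forms' score means and standard deviations:

```latex
% Classical linear equating function (background, not the paper's extension):
l_Y(x) = \mu_Y + \frac{\sigma_Y}{\sigma_X}\,(x - \mu_X)
```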
Jianbin Fu; TsungHan Ho; Xuan Tan – Practical Assessment, Research & Evaluation, 2025
Item parameter estimation using an item response theory (IRT) model with fixed ability estimates is useful in equating with small samples on anchor items. The current study explores the impact of three ability estimation methods (weighted likelihood estimation [WLE], maximum a posteriori [MAP], and posterior ability distribution estimation [PST])…
Descriptors: Item Response Theory, Test Items, Computation, Equated Scores
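As background on the ability estimation methods named above, here is a minimal grid-search sketch of MAP ability estimation under a 2PL model with fixed item parameters; the function name, prior settings, and example values are assumptions for illustration, not taken from the study, and WLE and PST involve additional corrections not shown.

```python
import numpy as np

def map_ability_2pl(responses, a, b, prior_mean=0.0, prior_sd=1.0):
    """MAP ability estimate under a 2PL model with item parameters held fixed.

    responses : 0/1 item scores (one examinee)
    a, b      : fixed discrimination and difficulty parameters
    The N(prior_mean, prior_sd) prior and the grid are illustrative choices.
    """
    theta = np.linspace(-4.0, 4.0, 401)                      # ability grid
    p = 1.0 / (1.0 + np.exp(-(a * (theta[:, None] - b))))    # P(correct | theta)
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    logprior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
    return theta[np.argmax(loglik + logprior)]               # posterior mode

# Hypothetical anchor-item parameters and one response pattern
a = np.array([1.0, 1.2, 0.8, 1.5, 1.0])
b = np.array([-1.0, 0.0, 0.5, 1.0, 1.5])
print(map_ability_2pl(np.array([1, 1, 1, 0, 0]), a, b))
```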
Leighton, Elizabeth A. – ProQuest LLC, 2022
The use of unidimensional scales that contain both positively and negatively worded items is common in both the educational and psychological fields. However, dimensionality investigations of these instruments often lead to a rejection of the theorized unidimensional model in favor of multidimensional structures, leaving researchers at odds for…
Descriptors: Test Items, Language Usage, Models, Statistical Analysis
Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022
A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…
Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis
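For context on the ETS guidelines mentioned above, a minimal sketch of the Mantel-Haenszel delta statistic with a simplified A/B/C labeling rule follows; the full ETS rules also require significance tests, which are omitted, and all names here are illustrative.

```python
import numpy as np

def mh_delta(strata):
    """Mantel-Haenszel common odds ratio converted to the ETS delta metric.

    strata : list of 2x2 counts per matched score level,
             [[ref_correct, ref_incorrect], [focal_correct, focal_incorrect]]
    """
    num = sum(A * D / (A + B + C + D) for (A, B), (C, D) in strata)
    den = sum(B * C / (A + B + C + D) for (A, B), (C, D) in strata)
    return -2.35 * np.log(num / den)        # delta-MH

def ets_category(delta):
    """Simplified ETS labels based on |delta| alone (significance checks omitted)."""
    d = abs(delta)
    return "A" if d < 1.0 else ("B" if d < 1.5 else "C")

# Hypothetical counts for two score strata
strata = [[[30, 10], [22, 18]], [[25, 5], [20, 10]]]
print(ets_category(mh_delta(strata)))
```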
Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023
This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for constructing a test blueprint for applying DCMs and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…
Descriptors: Classification, Models, Science Tests, Physics
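DCM test blueprints are typically organized around a Q-matrix mapping items to attributes; the sketch below is a generic, hypothetical illustration with a simple coverage check, not the evaluation criteria developed in the article.

```python
import numpy as np

# Hypothetical Q-matrix for a 6-item test measuring 3 attributes:
# rows = items, columns = attributes; 1 means the item requires that attribute.
Q = np.array([
    [1, 0, 0],   # item 1 isolates attribute 1
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],   # item 4 requires attributes 1 and 2
    [0, 1, 1],
    [1, 1, 1],
])

# Basic blueprint checks: how many items measure each attribute, and how many
# of those items isolate it (measure it alone).
items_per_attribute = Q.sum(axis=0)
isolating = Q[Q.sum(axis=1) == 1].sum(axis=0)
print(items_per_attribute, isolating)
```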
Wu, Tong; Kim, Stella Y.; Westine, Carl – Educational and Psychological Measurement, 2023
For large-scale assessments, data are often collected with missing responses. Despite the wide use of item response theory (IRT) in many testing programs, however, the existing literature offers little insight into the effectiveness of various approaches to handling missing responses in the context of scale linking. Scale linking is commonly used…
Descriptors: Data Analysis, Responses, Statistical Analysis, Measurement
Almehrizi, Rashid S. – Educational Measurement: Issues and Practice, 2022
Coefficient alpha reliability persists as the most common reliability coefficient reported in research. The assumptions for its use are, however, not well-understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that while these expressions are correct when estimating reliability for summed scores,…
Descriptors: Reliability, Scores, Scaling, Statistical Analysis
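For reference, the conventional summed-score expression of coefficient alpha that the paper scrutinizes can be computed as below; the function name and interface are illustrative.

```python
import numpy as np

def coefficient_alpha(X):
    """Coefficient (Cronbach's) alpha for summed scores.

    X : respondents-by-items matrix of item scores.
    alpha = k/(k-1) * (1 - sum of item variances / variance of the summed score)
    """
    k = X.shape[1]
    item_var_sum = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)
```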
Guastadisegni, Lucia; Cagnone, Silvia; Moustaki, Irini; Vasdekis, Vassilis – Educational and Psychological Measurement, 2022
This article studies the Type I error, false positive rates, and power of four versions of the Lagrange multiplier test to detect measurement noninvariance in item response theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange multiplier test computed with the Hessian and cross-product approach,…
Descriptors: Measurement, Statistical Analysis, Item Response Theory, Test Items
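In generic form, the Lagrange multiplier (score) test evaluated at the restricted, invariance-constrained estimate can be written as below; the versions compared in the article differ mainly in how the information matrix is approximated (Hessian versus cross-product of scores).

```latex
% Generic Lagrange multiplier (score) test statistic at the restricted estimate:
LM = s(\tilde{\theta})^{\top}\, I(\tilde{\theta})^{-1}\, s(\tilde{\theta})
% s = score (gradient) vector, I = information matrix; under the null hypothesis
% it is asymptotically chi-square with df equal to the number of constraints.
```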
Ranger, Jochen; Brauer, Kay – Journal of Educational and Behavioral Statistics, 2022
The generalized S-X²-test is a test of item fit for items with a polytomous response format. The test is based on a comparison of the observed and expected number of responses in strata defined by the test score. In this article, we make four contributions. We demonstrate that the performance of the generalized S-X²-test…
Descriptors: Goodness of Fit, Test Items, Statistical Analysis, Item Response Theory
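For orientation, a minimal sketch of the dichotomous Orlando-Thissen form of S-X², which compares observed and model-expected proportions correct within summed-score strata; the generalized polytomous version examined in the article compares full category counts, and the names below are illustrative.

```python
import numpy as np

def s_x2(observed_correct, n_per_score, expected_p):
    """Dichotomous S-X2 item-fit statistic over summed-score groups.

    observed_correct : observed number correct in each summed-score group
    n_per_score      : number of examinees in each group
    expected_p       : model-implied proportion correct in each group
    """
    o = observed_correct / n_per_score
    return np.sum(n_per_score * (o - expected_p) ** 2 / (expected_p * (1 - expected_p)))
```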
Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…
Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)
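One common regression-based DIF screen consistent with the descriptors above, though not necessarily the authors' procedure, is logistic regression in the style of Swaminathan and Rogers; a minimal sketch, assuming statsmodels is available and that inputs are NumPy arrays.

```python
import numpy as np
import statsmodels.api as sm

def logistic_dif(item, matching_score, group):
    """Logistic-regression DIF screen for one dichotomous item.

    item           : 0/1 responses to the studied item
    matching_score : matching variable (e.g., total or rest score)
    group          : 0 = reference group, 1 = focal group
    Uniform DIF shows up in the group term; nonuniform DIF in the interaction.
    """
    X = sm.add_constant(np.column_stack([matching_score, group,
                                         matching_score * group]))
    fit = sm.Logit(item, X).fit(disp=0)
    return fit.params, fit.pvalues
```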
Lanrong Li – ProQuest LLC, 2021
When developing a test, it is essential to ensure that the test is free of items with differential item functioning (DIF). DIF occurs when examinees of equal ability, but from different examinee subgroups, have different chances of getting the item correct. According to the multidimensional perspective, DIF occurs because the test measures more…
Descriptors: Test Bias, Test Items, Meta Analysis, Effect Size
Lanrong Li; Betsy Jane Becker – Journal of Educational Measurement, 2021
Differential bundle functioning (DBF) has been proposed to quantify the accumulated amount of differential item functioning (DIF) in an item cluster/bundle (Douglas, Roussos, and Stout). The simultaneous item bias test (SIBTEST, Shealy and Stout) has been used to test for DBF (e.g., Walker, Zhang, and Surber). Research on DBF may have the…
Descriptors: Test Bias, Test Items, Meta Analysis, Effect Size
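In its usual form, the SIBTEST/DBF effect size referred to above is a weighted difference of regression-corrected bundle means between the reference (R) and focal (F) groups across matched score groups k:

```latex
% SIBTEST / DBF effect size (focal-group weights \hat{p}_k over score groups k):
\hat{\beta}_{\mathrm{UNI}} = \sum_{k} \hat{p}_k \left( \bar{Y}^{*}_{Rk} - \bar{Y}^{*}_{Fk} \right)
```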
Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023
Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed toward using both item scores and response times. Item revisit patterns of examinees can also be utilized as…
Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time
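As a toy illustration of combining item scores with response times, and emphatically not the detection method developed in the article, one might flag responses that are both correct and unusually fast for the item; all names and the cutoff below are assumptions.

```python
import numpy as np

def fast_correct_flags(correct, log_rt, z_cut=-2.0):
    """Flag responses that are correct and unusually fast for that item.

    correct : examinees-by-items 0/1 score matrix
    log_rt  : examinees-by-items log response times
    z_cut   : hypothetical cutoff on the within-item z-score of log RT
    """
    z = (log_rt - log_rt.mean(axis=0)) / log_rt.std(axis=0, ddof=1)
    return (correct == 1) & (z < z_cut)
```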
Mark Wilson – Journal of Educational and Behavioral Statistics, 2024
This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standardized tests), meso (externally developed items), and micro (on-the-fly in the classroom). The first level is the usual context for educational…
Descriptors: Educational Assessment, Measurement, Standardized Tests, Test Items
Jiajing Huang – ProQuest LLC, 2022
The nonequivalent-groups anchor-test (NEAT) data-collection design is commonly used in large-scale assessments. Under this design, different test groups take different test forms. Each test form has its own unique items and all test forms share a set of common items. If item response theory (IRT) models are applied to analyze the test data, the…
Descriptors: Item Response Theory, Test Format, Test Items, Test Construction
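One common way to place new-form IRT item parameters on the base scale under a NEAT design is mean-sigma linking through the anchor items; a minimal sketch with illustrative function names (the dissertation may instead use characteristic-curve methods such as Stocking-Lord).

```python
import numpy as np

def mean_sigma_constants(b_anchor_base, b_anchor_new):
    """Mean-sigma linking constants from anchor-item difficulties estimated
    separately on the base and new forms."""
    A = np.std(b_anchor_base, ddof=1) / np.std(b_anchor_new, ddof=1)
    B = np.mean(b_anchor_base) - A * np.mean(b_anchor_new)
    return A, B

def to_base_scale(a_new, b_new, A, B):
    """Transform new-form parameters onto the base scale: b* = A*b + B, a* = a/A."""
    return np.asarray(a_new) / A, A * np.asarray(b_new) + B
```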