ERIC - Search Results

Publication Date

In 2025	3
Since 2024	7
Since 2021 (last 5 years)	43
Since 2016 (last 10 years)	296
Since 2006 (last 20 years)	610

Descriptor

Statistical Analysis	849
Test Items	849
Item Response Theory	226
Foreign Countries	218
Item Analysis	198
Difficulty Level	177
Test Construction	176
Comparative Analysis	167
Scores	153
Test Bias	143
Correlation	127
Test Validity	120
Test Reliability	108
Multiple Choice Tests	104
Psychometrics	98
Models	97
Goodness of Fit	80
Simulation	79
English (Second Language)	75
Factor Analysis	72
Language Tests	72
Test Format	71
Computation	68
Achievement Tests	67
Computer Assisted Testing	64

Education Level

Higher Education	184
Postsecondary Education	145
Secondary Education	94
Elementary Education	68
Middle Schools	42
High Schools	38
Junior High Schools	32
Grade 8	27
Elementary Secondary Education	22
Grade 4	14
Grade 5	13
Grade 6	13
Intermediate Grades	13
Early Childhood Education	10
Grade 3	10
Grade 7	10
Grade 9	9
Primary Education	7
Grade 12	5
Preschool Education	5
Grade 11	3
Grade 2	3
Kindergarten	3
Grade 1	2
Grade 10	2

Audience

Researchers	30
Practitioners	3
Teachers	3
Policymakers	1

Location

Turkey	31
Germany	15
Australia	13
Canada	11
Netherlands	11
Japan	9
Taiwan	8
United States	8
Israel	7
Sweden	7
California	6
China	6
Nigeria	6
Singapore	6
Florida	5
India	4
Iran	4
Massachusetts	4
Minnesota	4
Texas	4
United Kingdom	4
United Kingdom (England)	4
Belgium	3
Colorado	3
Hong Kong	3

What Works Clearinghouse Rating

Meets WWC Standards without Reservations	1
Meets WWC Standards with or without Reservations	1

Showing 1 to 15 of 849 results

A Comparison of Anchor Selection Strategies for DIF Analysis

Peer reviewed

Haeju Lee; Kyung Yong Kim – Journal of Educational Measurement, 2025

When no prior information of differential item functioning (DIF) exists for items in a test, either the rank-based or iterative purification procedure might be preferred. The rank-based purification selects anchor items based on a preliminary DIF test. For a preliminary DIF test, likelihood ratio test (LRT) based approaches (e.g.,…

Descriptors: Test Items, Equated Scores, Test Bias, Accuracy

Simultaneous Linear Equating for Scenarios with Optional Test Versions or across Multiple Alternative Anchors

Peer reviewed
PDF on ERIC

Tom Benton – Practical Assessment, Research & Evaluation, 2025

This paper proposes an extension of linear equating that may be useful in one of two fairly common assessment scenarios. One is where different students have taken different combinations of test forms. This might occur, for example, where students have some free choice over the exam papers they take within a particular qualification. In this…

Descriptors: Equated Scores, Test Format, Test Items, Computation

Impacts of Differences in Group Abilities and Anchor Test Features on Three Non-IRT Test Equating Methods

Peer reviewed
PDF on ERIC

Inga Laukaityte; Marie Wiberg – Practical Assessment, Research & Evaluation, 2024

The overall aim was to examine effects of differences in group ability and features of the anchor test form on equating bias and the standard error of equating (SEE) using both real and simulated data. Chained kernel equating, Poststratification kernel equating, and Circle-arc equating were studied. A college admissions test with four different…

Descriptors: Ability Grouping, Test Items, College Entrance Examinations, High Stakes Tests

A Comparison of IRT Linking Approaches under the Nonequivalent Groups Anchor Test Design

Jiajing Huang – ProQuest LLC, 2022

The nonequivalent-groups anchor-test (NEAT) data-collection design is commonly used in large-scale assessments. Under this design, different test groups take different test forms. Each test form has its own unique items and all test forms share a set of common items. If item response theory (IRT) models are applied to analyze the test data, the…

Descriptors: Item Response Theory, Test Format, Test Items, Test Construction

An Introduction to Statistical Techniques Used for Detecting Anomaly in Test Results

Peer reviewed

He, Qingping; Meadows, Michelle; Black, Beth – Research Papers in Education, 2022

A potential negative consequence of high-stakes testing is inappropriate test behaviour involving individuals and/or institutions. Inappropriate test behaviour and test collusion can result in aberrant response patterns and anomalous test scores and invalidate the intended interpretation and use of test results. A variety of statistical techniques…

Descriptors: Statistical Analysis, High Stakes Tests, Scores, Response Style (Tests)

Optimizing Diagnostic Classification Models Application Considering Real-Life Constraints

Peer reviewed

Su, Kun; Henson, Robert A. – Journal of Educational and Behavioral Statistics, 2023

This article provides a process to carefully evaluate the suitability of a content domain for which diagnostic classification models (DCMs) could be applicable and then optimized steps for constructing a test blueprint for applying DCMs and a real-life example illustrating this process. The content domains were carefully evaluated using a set of…

Descriptors: Classification, Models, Science Tests, Physics

Reevaluating the SIBTEST Classification Heuristics for Dichotomous Differential Item Functioning

Peer reviewed

Weese, James D.; Turner, Ronna C.; Ames, Allison; Crawford, Brandon; Liang, Xinya – Educational and Psychological Measurement, 2022

A simulation study was conducted to investigate the heuristics of the SIBTEST procedure and how it compares with ETS classification guidelines used with the Mantel-Haenszel procedure. Prior heuristics have been used for nearly 25 years, but they are based on a simulation study that was restricted due to computer limitations and that modeled item…

Descriptors: Test Bias, Heuristics, Classification, Statistical Analysis

Mean Comparisons of Many Groups in the Presence of DIF: An Evaluation of Linking and Concurrent Scaling Approaches

Peer reviewed

Robitzsch, Alexander; Lüdtke, Oliver – Journal of Educational and Behavioral Statistics, 2022

One of the primary goals of international large-scale assessments in education is the comparison of country means in student achievement. This article introduces a framework for discussing differential item functioning (DIF) for such mean comparisons. We compare three different linking methods: concurrent scaling based on full invariance,…

Descriptors: Test Bias, International Assessment, Scaling, Comparative Analysis

Finding the Right Grain-Size for Measurement in the Classroom

Peer reviewed

Mark Wilson – Journal of Educational and Behavioral Statistics, 2024

This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standardized tests), meso (externally developed items), and micro (on-the-fly in the classroom). The first level is the usual context for educational…

Descriptors: Educational Assessment, Measurement, Standardized Tests, Test Items

Detecting Item Preknowledge Using Revisits with Speed and Accuracy

Peer reviewed

Demirkaya, Onur; Bezirhan, Ummugul; Zhang, Jinming – Journal of Educational and Behavioral Statistics, 2023

Examinees with item preknowledge tend to obtain inflated test scores that undermine test score validity. With the availability of process data collected in computer-based assessments, the research on detecting item preknowledge has progressed on using both item scores and response times. Item revisit patterns of examinees can also be utilized as…

Descriptors: Test Items, Prior Learning, Knowledge Level, Reaction Time

Designing and Evaluating Tasks to Measure Individual Differences in Experimental Psychology: A Tutorial

Peer reviewed

Marc Brysbaert – Cognitive Research: Principles and Implications, 2024

Experimental psychology is witnessing an increase in research on individual differences, which requires the development of new tasks that can reliably assess variations among participants. To do this, cognitive researchers need statistical methods that many researchers have not learned during their training. The lack of expertise can pose…

Descriptors: Experimental Psychology, Individual Differences, Statistical Analysis, Task Analysis

Testing Differential Item Functioning without Predefined Anchor Items Using Robust Regression

Peer reviewed

Wang, Weimeng; Liu, Yang; Liu, Hongyun – Journal of Educational and Behavioral Statistics, 2022

Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection…

Descriptors: Test Bias, Test Items, Equated Scores, Regression (Statistics)

Comparing Drift Detection Methods for Accurate Rasch Equating in Different Sample Sizes

Peer reviewed

Alahmadi, Sarah; Jones, Andrew T.; Barry, Carol L.; Ibáñez, Beatriz – Applied Measurement in Education, 2023

Rasch common-item equating is often used in high-stakes testing to maintain equivalent passing standards across test administrations. If unaddressed, item parameter drift poses a major threat to the accuracy of Rasch common-item equating. We compared the performance of well-established and newly developed drift detection methods in small and large…

Descriptors: Equated Scores, Item Response Theory, Sample Size, Test Items

Evaluating the Effects of Analytical Decisions in Large-Scale Assessments: Analyzing PISA Mathematics 2003-2012

Peer reviewed

Heine, Jörg-Henrik; Robitzsch, Alexander – Large-scale Assessments in Education, 2022

Research Question: This paper examines the overarching question of to what extent different analytic choices may influence the inference about country-specific cross-sectional and trend estimates in international large-scale assessments. We take data from the assessment of PISA mathematics proficiency from the four rounds from 2003 to 2012 as a…

Descriptors: Foreign Countries, International Assessment, Achievement Tests, Secondary School Students

Assessing Differential Bundle Functioning Using Meta-Analysis

Lanrong Li – ProQuest LLC, 2021

When developing a test, it is essential to ensure that the test is free of items with differential item functioning (DIF). DIF occurs when examinees of equal ability, but from different examinee subgroups, have different chances of getting the item correct. According to the multidimensional perspective, DIF occurs because the test measures more…

Descriptors: Test Bias, Test Items, Meta Analysis, Effect Size

Source

Educational and Psychological…	77
ETS Research Report Series	52
ProQuest LLC	34
Applied Psychological…	31
Journal of Educational…	29
Journal of Educational and…	26
Applied Measurement in…	23
International Journal of…	13
Online Submission	12
Psychometrika	12
Language Testing	11
Chemistry Education Research…	10
Grantee Submission	10
Practical Assessment,…	10
International Journal of…	9
Language Assessment Quarterly	9
Educational Measurement:…	8
International Journal of…	8
Journal of Education and…	8
Journal of Psychoeducational…	8
CBE - Life Sciences Education	7
College Entrance Examination…	7
Educational Testing Service	6
Eurasian Journal of…	6
International Journal of…	6

Author

Sinharay, Sandip	14
Dorans, Neil J.	8
von Davier, Alina A.	7
Guo, Hongwen	6
Holland, Paul W.	6
Raykov, Tenko	6
Chang, Hua-Hua	5
Kim, Sooyeon	5
Liu, Jinghua	5
Livingston, Samuel A.	5
Magis, David	5
Marcoulides, George A.	5
Reckase, Mark D.	5
Wainer, Howard	5
Wilson, Mark	5
De Boeck, Paul	4
DeMars, Christine E.	4
Dimitrov, Dimiter M.	4
Feigenbaum, Miriam	4
Lee, Yi-Hsuan	4
Robin, Frederic	4
Robitzsch, Alexander	4
Suh, Youngsuk	4
Tindal, Gerald	4

Publication Type

Reports - Research	648
Journal Articles	623
Reports - Evaluative	106
Speeches/Meeting Papers	79
Tests/Questionnaires	47
Reports - Descriptive	37
Dissertations/Theses -…	35
Numerical/Quantitative Data	17
Opinion Papers	8
Information Analyses	6
Guides - Non-Classroom	4
Reports - General	3
Collected Works - General	2
Collected Works - Proceedings	2
ERIC Digests in Full Text	2
ERIC Publications	2
Guides - Classroom - Teacher	2
Book/Product Reviews	1
Books	1
Guides - Classroom - Learner	1
Guides - General	1
Reference Materials -…	1

Assessments and Surveys

SAT (College Admission Test)	23
Program for International…	14
Test of English as a Foreign…	14
Trends in International…	13
Graduate Record Examinations	11
National Assessment of…	9
ACT Assessment	5
Comprehensive Tests of Basic…	4
Iowa Tests of Basic Skills	4
Law School Admission Test	4
Stanford Binet Intelligence…	3
Test of English for…	3
Advanced Placement…	2
Beginning Postsecondary…	2
Florida Comprehensive…	2
International English…	2
Raven Advanced Progressive…	2
United States Medical…	2
Wechsler Adult Intelligence…	2
ACT Interest Inventory	1
Armed Services Vocational…	1
Block Design Test	1
Boehm Test of Basic Concepts	1
California Achievement Tests	1
College Level Examination…	1