Publication Date
  In 2025: 22
  Since 2024: 51
  Since 2021 (last 5 years): 196
  Since 2016 (last 10 years): 452
  Since 2006 (last 20 years): 772
Descriptor
  Item Response Theory: 772
  Test Reliability: 492
  Test Validity: 294
  Foreign Countries: 263
  Test Items: 236
  Psychometrics: 220
  Reliability: 210
  Test Construction: 170
  Scores: 157
  Measures (Individuals): 131
  Validity: 100
Author
  Petscher, Yaacov: 11
  Schoen, Robert C.: 9
  Alonzo, Julie: 7
  Tindal, Gerald: 7
  Wang, Wen-Chung: 7
  Wind, Stefanie A.: 7
  Foorman, Barbara R.: 6
  Johnson, Evelyn S.: 6
  Moylan, Laura A.: 6
  Zheng, Yuzhu: 6
  Anderson, Daniel: 5
Audience
  Researchers: 3
  Administrators: 1
  Practitioners: 1
Location
  Indonesia: 30
  Turkey: 26
  Florida: 17
  Germany: 16
  Australia: 15
  China: 15
  Malaysia: 15
  Taiwan: 15
  United States: 12
  Hong Kong: 11
  California: 8
Laws, Policies, & Programs
  No Child Left Behind Act 2001: 2
  American Recovery and…: 1
  Elementary and Secondary…: 1
  Individuals with Disabilities…: 1
Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025
This study examined the widely used threshold of 0.2 for Yen's Q3, an index for violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…
Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items
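For reference, Yen's Q3 for an item pair (j, k) is the correlation, taken across examinees, of the residuals that remain after an IRT model has been fit; a minimal statement of the standard definition, with u_{ij} the scored response and \hat{P}_j the model-implied probability:

    d_{ij} = u_{ij} - \hat{P}_j(\hat{\theta}_i), \qquad Q_{3,jk} = \operatorname{corr}_i(d_{ij}, d_{ik})

Item pairs whose |Q3| exceeds a cut-off such as the 0.2 examined here are flagged as locally dependent.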
William C. M. Belzak; Daniel J. Bauer – Journal of Educational and Behavioral Statistics, 2024
Testing for differential item functioning (DIF) has seen rapid statistical development in recent years. Moderated nonlinear factor analysis (MNLFA) allows simultaneous testing of DIF across multiple categorical and continuous covariates (e.g., sex, age, ethnicity), and regularization has shown promising results for identifying DIF among…
Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement
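As background (a sketch of the general MNLFA idea, not the authors' exact specification), MNLFA lets item parameters vary as functions of covariates, and DIF testing asks whether the moderation terms differ from zero; with a single covariate x and linear moderation of an item's intercept and loading:

    \nu_j(x) = \nu_{j0} + \nu_{j1} x, \qquad \lambda_j(x) = \lambda_{j0} + \lambda_{j1} x

Regularized estimation then shrinks the DIF parameters \nu_{j1} and \lambda_{j1} toward zero, so that only effects with clear support in the data survive.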
Ryan M. Cook; Stefanie A. Wind – Measurement and Evaluation in Counseling and Development, 2024
The purpose of this article is to discuss reliability and precision through the lens of a modern measurement approach, item response theory (IRT). Reliability evidence in counseling is generated primarily with Classical Test Theory (CTT) approaches, although recent studies in the field have shown the benefits of using…
Descriptors: Item Response Theory, Measurement, Reliability, Accuracy
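A worked contrast with CTT's single coefficient: in IRT, precision varies along the latent trait, so reliability can be reported conditionally. Assuming a standard-normal trait, a common conversion from test information I(\theta) is

    \mathrm{SE}(\hat{\theta}) = 1 / \sqrt{I(\theta)}, \qquad r(\theta) = \frac{I(\theta)}{I(\theta) + 1}

so a region of the scale with I(\theta) = 4 has conditional reliability 0.80, while the same instrument may be far less precise elsewhere.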
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often include a cluster of items linked by a common stimulus (a "testlet"). In such a design, the dependencies induced among the items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
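For background, the standard (nondirectional) testlet model handles such dependence by adding a testlet-specific effect shared by all items in a cluster; in two-parameter form, with d(j) the testlet containing item j:

    P(u_{ij} = 1) = \operatorname{logit}^{-1}\!\big( a_j (\theta_i + \gamma_{i, d(j)}) - b_j \big)

The directional effect studied here goes further, allowing responses to earlier items in a testlet to feed into later ones.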
Chia-Lin Tsai; Stefanie Wind; Samantha Estrada – Measurement: Interdisciplinary Research and Perspectives, 2025
Researchers who work with ordinal rating scales sometimes encounter situations where the scale categories do not function in the intended or expected way. For example, participants' use of scale categories may result in an empirical difficulty ordering for the categories that does not match what was intended. Likewise, the level of distinction…
Descriptors: Rating Scales, Item Response Theory, Psychometrics, Self Efficacy
Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025
Response styles pose a serious threat to psychological measurement. This research compares IRTree models and anchoring vignettes for addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and the total-score level (ratios of extreme and middle responses to vignettes). Four models…
Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes
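To make the IRTree idea concrete, here is a minimal sketch (the helper name and coding scheme are illustrative, not the authors' code) of a common three-node decomposition of a 5-point Likert response into midpoint, direction, and extremity pseudo-items:

    import numpy as np

    def irtree_recode(responses):
        """Map 5-point Likert responses (1-5) onto three binary pseudo-items:
        midpoint selection, direction, and extremity. Nodes a response never
        reaches are coded NaN and treated as missing when the tree is fit."""
        r = np.asarray(responses, dtype=float)
        midpoint = (r == 3).astype(float)                             # node 1: midscale response?
        direction = np.where(r == 3, np.nan, (r >= 4).astype(float))  # node 2: agree (4,5) vs. disagree (1,2)
        extremity = np.where(r == 3, np.nan,
                             np.isin(r, (1.0, 5.0)).astype(float))    # node 3: endpoint category?
        return np.column_stack([midpoint, direction, extremity])

Each pseudo-item can then be fit with a binary IRT model, separating the target trait from extreme and midpoint response styles.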
Reeta Neittaanmäki; Iasonas Lamprianou – Language Testing, 2024
This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated…
Descriptors: Foreign Countries, Interrater Reliability, Evaluators, Item Response Theory
Siqi Huang – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023
The goal of this paper is twofold. First, it clarifies and elaborates an important theoretical construct called orientation with respect to understanding in mathematics, which denotes the degree to which students show an inclination toward, and an earnest concern for, understanding in mathematical learning. Second, the…
Descriptors: Mathematics Instruction, Teaching Methods, Problem Solving, Reliability
Thomas W. Frazier; Andrew J. O. Whitehouse; Susan R. Leekam; Sarah J. Carrington; Gail A. Alvares; David W. Evans; Antonio Y. Hardan; Mirko Uljarevic – Journal of Autism and Developmental Disorders, 2024
Purpose: The aim of the present study was to compare scale and conditional reliability derived from item response theory analyses among the most commonly used, as well as several newly developed, observation, interview, and parent-report autism instruments. Methods: When available, data sets were combined to facilitate large-sample evaluation…
Descriptors: Test Reliability, Item Response Theory, Autism Spectrum Disorders, Clinical Diagnosis
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
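Of the classical indices compared here, Cronbach's alpha is the easiest to state directly; a self-contained sketch (the function name is illustrative):

    import numpy as np

    def cronbach_alpha(scores):
        """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
        x = np.asarray(scores, dtype=float)
        k = x.shape[1]
        item_vars = x.var(axis=0, ddof=1)        # per-item sample variances
        total_var = x.sum(axis=1).var(ddof=1)    # variance of respondents' total scores
        return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

Rasch- and IRT-based reliability, by contrast, is built from model-implied measurement error, so it can disagree with alpha when model assumptions matter.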
Joshua B. Gilbert; James G. Soland; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value-Added Models (VAMs) are both common and controversial in education policy and accountability research. While the sensitivity of VAMs to model specification and covariate selection is well documented, the extent to which test scoring methods (e.g., mean scores vs. IRT-based scores) may affect VA estimates is less studied. We examine the…
Descriptors: Value Added Models, Tests, Testing, Scoring
Novina Sabila Zahra; Hillman Wirawan – Measurement: Interdisciplinary Research and Perspectives, 2025
Technological development has triggered digital transformation across organizations, influencing work processes, communication, and innovation. Digital leadership plays a crucial role in directing and managing this transformation. This research aims to develop a new measurement tool for assessing digital leadership using the Rasch Model for…
Descriptors: Leadership, Measures (Individuals), Test Validity, Item Response Theory
Lientje Maas; Matthew J. Madison; Matthieu J. S. Brinkhuis – Grantee Submission, 2024
Diagnostic classification models (DCMs) are psychometric models that yield probabilistic classifications of respondents according to a set of discrete latent variables. The current study examines the recently introduced one-parameter log-linear cognitive diagnosis model (1-PLCDM), which has increased interpretability compared with general DCMs due…
Descriptors: Clinical Diagnosis, Classification, Models, Psychometrics