Publication Date
In 2025: 15
Since 2024: 36
Since 2021 (last 5 years): 152
Since 2016 (last 10 years): 339
Since 2006 (last 20 years): 488
Descriptor
Item Response Theory: 546
Test Reliability: 546
Test Validity: 302
Test Items: 214
Foreign Countries: 185
Test Construction: 171
Psychometrics: 170
Scores: 104
Measures (Individuals): 80
Difficulty Level: 78
Factor Analysis: 65
Author
Schoen, Robert C.: 9
Petscher, Yaacov: 7
Wang, Wen-Chung: 7
Alonzo, Julie: 6
Tindal, Gerald: 6
Anderson, Daniel: 4
Bauduin, Charity: 4
Biancarosa, Gina: 4
Carlson, Sarah E.: 4
Davison, Mark L.: 4
Irvin, P. Shawn: 4
Audience
Researchers: 2
Administrators: 1
Practitioners: 1
Location
Indonesia: 27
Malaysia: 15
Germany: 13
Turkey: 13
Taiwan: 12
United States: 12
Florida: 11
China: 9
Australia: 8
New York: 8
Hong Kong: 7
Laws, Policies, & Programs
Individuals with Disabilities…: 1
No Child Left Behind Act 2001: 1
Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025
This study examined the widely used threshold of 0.2 for Yen's Q3, an index for violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…
Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items
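Yen's Q3 is the correlation between pairs of items' residuals after an IRT model has been fit; values far from zero flag possible local dependence, and 0.2 is the conventional cut-off the study re-examines. A minimal sketch of the computation under a 2PL model (the response matrix, item parameters, and ability estimates here are toy inputs, not data from the study):

```python
import numpy as np

def q3_matrix(responses, theta, a, b):
    """Yen's Q3: correlations of item residuals after fitting a 2PL model.

    responses : (N, J) 0/1 matrix; theta : (N,) ability estimates;
    a, b : (J,) discrimination and difficulty estimates (assumed given).
    """
    # Model-implied probability of a correct response for each person-item pair
    p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
    residuals = responses - p                     # observed minus expected
    return np.corrcoef(residuals, rowvar=False)   # J x J residual correlations

# Toy example: pairs with |Q3| above ~0.2 would conventionally be flagged.
rng = np.random.default_rng(0)
theta = rng.normal(size=500)
a, b = np.ones(6), np.linspace(-1, 1, 6)
resp = (rng.random((500, 6)) < 1 / (1 + np.exp(-a * (theta[:, None] - b)))).astype(int)
print(np.round(q3_matrix(resp, theta, a, b), 2))
```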
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
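The RDIF family of statistics is built on item-level residuals (observed responses minus model-expected probabilities) aggregated within groups. As a rough illustration of that residual idea only, not the GRDIF tests themselves, one can compare mean residuals for a studied item across groups:

```python
import numpy as np

def mean_residual_by_group(responses_j, theta, a_j, b_j, group):
    """Sketch of residual-based DIF screening for one item.

    Residuals are observed responses minus the probabilities implied by a
    2PL item with parameters (a_j, b_j); a systematically nonzero mean
    residual in one group suggests DIF for that group.
    """
    p = 1.0 / (1.0 + np.exp(-a_j * (theta - b_j)))
    resid = responses_j - p
    return {g: float(resid[group == g].mean()) for g in np.unique(group)}
```

With more than two groups, the GRDIF framework aggregates this kind of residual information across all groups simultaneously; the sketch above only conveys the underlying quantity.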
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often include a cluster of items linked to a common stimulus (a "testlet"). In such a design, the dependencies this induces among items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
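A common way to formalize testlet effects is to add a person-by-testlet parameter to the IRT model. The standard (non-directional) testlet version of the 2PL is shown below for orientation; the DTE model in the article instead lets earlier responses feed into later ones:

```latex
% Generic 2PL testlet model: \gamma_{i,d(j)} is person i's effect for the
% testlet d(j) that contains item j.
P(U_{ij}=1 \mid \theta_i) =
  \frac{\exp\!\bigl(a_j(\theta_i - b_j + \gamma_{i,d(j)})\bigr)}
       {1 + \exp\!\bigl(a_j(\theta_i - b_j + \gamma_{i,d(j)})\bigr)}
```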
Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025
Response styles pose serious threats to psychological measurement. This research compares IRTree models and anchoring vignettes for addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and at the total-score level (ratios of extreme and middle responses to vignettes). Four models…
Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes
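IRTree models decompose each ordinal response into a tree of binary pseudo-items (e.g., midpoint vs. not, agree vs. disagree, extreme vs. moderate), each governed by its own latent trait. Below is a minimal sketch of the common three-node decomposition of a 5-point response; this is the standard mapping from the IRTree literature, not necessarily the exact models compared in the study:

```python
import numpy as np

# Three binary pseudo-items per response: M = midpoint, D = direction,
# E = extremity. NaN means the node is not reached for that response.
MAP = {
    1: (0, 0, 1),            # strongly disagree: not midpoint, disagree, extreme
    2: (0, 0, 0),
    3: (1, np.nan, np.nan),  # midpoint: direction/extremity nodes not reached
    4: (0, 1, 0),
    5: (0, 1, 1),
}

def to_pseudo_items(likert):
    """Expand an (N, J) matrix of 1..5 responses to an (N, 3*J) pseudo-item matrix."""
    likert = np.asarray(likert)
    out = np.array([[MAP[x] for x in row] for row in likert], dtype=float)
    return out.reshape(likert.shape[0], -1)

print(to_pseudo_items([[1, 3, 5]]))
```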
Thomas W. Frazier; Andrew J. O. Whitehouse; Susan R. Leekam; Sarah J. Carrington; Gail A. Alvares; David W. Evans; Antonio Y. Hardan; Mirko Uljarevic – Journal of Autism and Developmental Disorders, 2024
Purpose: The aim of the present study was to compare scale and conditional reliability derived from item response theory analyses among the most commonly used, as well as several newly developed, observation, interview, and parent-report autism instruments. Methods: When available, data sets were combined to facilitate large sample evaluation.…
Descriptors: Test Reliability, Item Response Theory, Autism Spectrum Disorders, Clinical Diagnosis
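Conditional reliability in IRT is derived from the test information function: the conditional standard error of measurement is the inverse square root of information, and, with the latent trait scaled to unit variance, one common formulation of reliability at a given trait level is the following (general background, not a result specific to these instruments):

```latex
% Conditional SEM and conditional reliability at trait level \theta,
% with I(\theta) the test information and Var(\theta) = 1.
\mathrm{SEM}(\theta) = \frac{1}{\sqrt{I(\theta)}}, \qquad
\rho(\theta) = \frac{I(\theta)}{I(\theta) + 1}
```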
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing have led to a greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025
This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…
Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests
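Cronbach's alpha, one of the classical indices compared in the study, depends only on the item variances and the total-score variance. A minimal computation on toy data (not the study's C-test responses):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (N persons, k items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(1)
true_score = rng.normal(size=(150, 1))
items = true_score + rng.normal(scale=0.8, size=(150, 5))  # five roughly parallel items
print(round(cronbach_alpha(items), 3))
```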
Lientje Maas; Matthew J. Madison; Matthieu J. S. Brinkhuis – Grantee Submission, 2024
Diagnostic classification models (DCMs) are psychometric models that yield probabilistic classifications of respondents according to a set of discrete latent variables. The current study examines the recently introduced one-parameter log-linear cognitive diagnosis model (1-PLCDM), which has increased interpretability compared with general DCMs due…
Descriptors: Clinical Diagnosis, Classification, Models, Psychometrics
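DCMs classify respondents by computing, for each latent attribute profile, the posterior probability of that profile given the observed responses. The sketch below shows that generic classification step, with the item-by-profile response probabilities and profile priors assumed already estimated; it illustrates general DCM machinery, not the 1-PLCDM itself:

```python
import numpy as np

def classify(responses, p_correct, prior):
    """Posterior probabilities over attribute profiles for each respondent.

    responses : (N, J) 0/1 matrix
    p_correct : (P profiles, J items) model-implied probabilities of a correct
                response under each profile (assumed given)
    prior     : (P,) prior profile probabilities
    """
    # Likelihood of each observed response pattern under each profile
    lik = np.prod(
        p_correct[None, :, :] ** responses[:, None, :]
        * (1 - p_correct[None, :, :]) ** (1 - responses[:, None, :]),
        axis=2,
    )                                   # shape (N, P)
    post = lik * prior                  # unnormalized posterior
    return post / post.sum(axis=1, keepdims=True)
```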
Madeline A. Schellman; Matthew J. Madison – Grantee Submission, 2024
Diagnostic classification models (DCMs) have grown in popularity as stakeholders increasingly desire actionable information related to students' skill competencies. Longitudinal DCMs offer a psychometric framework for providing estimates of students' proficiency status transitions over time. For both cross-sectional and longitudinal DCMs, it is…
Descriptors: Diagnostic Tests, Classification, Models, Psychometrics
Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024
This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…
Descriptors: Korean, Test Validity, Test Reliability, Imitation
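Because TOPIK levels are ordinal, a rank correlation is one natural way to summarize the score-to-level relationship in a validation like this. A toy example with made-up numbers, using scipy (not the study's data or necessarily its exact analysis):

```python
from scipy.stats import spearmanr

# Hypothetical average EI test scores paired with TOPIK levels (1-6)
ei_scores = [42, 55, 61, 70, 78, 88, 90]
topik_levels = [1, 2, 3, 3, 4, 5, 6]

rho, p_value = spearmanr(ei_scores, topik_levels)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```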
Jiayi Deng – ProQuest LLC, 2024
Test score comparability in international large-scale assessments (LSA) is of utmost importance in measuring the effectiveness of education systems and understanding the impact of education on economic growth. To effectively compare test scores on an international scale, score linking is widely used to convert raw scores from different linguistic…
Descriptors: Item Response Theory, Scoring Rubrics, Scoring, Error of Measurement
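One widely used family of IRT linking methods places parameters from two test forms (or language versions) on a common scale through a linear transformation. The mean-sigma method, for example, chooses the slope and intercept from the difficulty estimates of the common items; it is shown here as general background rather than as the dissertation's specific procedure:

```latex
% Mean-sigma linking: transform form-X parameters onto the form-Y scale
% using the means and SDs of the common items' difficulty estimates.
A = \frac{\sigma(b_Y)}{\sigma(b_X)}, \qquad B = \mu(b_Y) - A\,\mu(b_X), \qquad
\theta^{*} = A\theta + B, \quad b^{*} = Ab + B, \quad a^{*} = a / A
```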
Enrico Gandolfi; Richard E. Ferdig – Educational Technology Research and Development, 2025
Augmented Reality (AR) is increasingly being adopted in education to foster engagement and interest in a variety of subjects and content areas. However, there is a scarcity of instruments to measure the instructional impact of this innovation. This article addresses this gap in two unique ways. First, it presents validation results of the…
Descriptors: Simulated Environment, Measures (Individuals), Rating Scales, Item Response Theory
Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025
To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…
Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory
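Effort-moderated scoring flags a response as a rapid guess when its response time falls below an item-specific threshold and then drops flagged responses from the ability-estimation likelihood. A compact sketch of the flagging-and-filtering step (the thresholds and the downstream IRT estimator are assumed to exist; this is not the full EM procedure from the literature):

```python
import numpy as np

def effort_filter(responses, rt, thresholds):
    """Mask rapid guesses so only effortful responses enter theta estimation.

    responses  : (N, J) scored responses
    rt         : (N, J) response times in seconds
    thresholds : (J,) item-level rapid-guessing time thresholds, assumed set
                 beforehand (e.g., from response-time distributions)
    """
    effortful = rt >= thresholds                      # True where the response counts
    masked = np.where(effortful, responses, np.nan)   # NaN = excluded from scoring
    return masked, effortful.mean(axis=1)             # plus per-person effort rate
```

The study's question is how well this unidimensional procedure holds up when the rapid guessing itself is multidimensional.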
Anirudhan Badrinath; Zachary Pardos – Journal of Educational Data Mining, 2025
Bayesian Knowledge Tracing (BKT) is a well-established model for formative assessment, with optimization typically using expectation maximization, conjugate gradient descent, or brute force search. However, one of the flaws of existing optimization techniques for BKT models is convergence to undesirable local minima that negatively impact…
Descriptors: Bayesian Statistics, Intelligent Tutoring Systems, Problem Solving, Audience Response Systems
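Bayesian Knowledge Tracing tracks the probability that a learner has mastered a skill, updating it after each observed response using four parameters: initial mastery P(L0), learning rate P(T), slip P(S), and guess P(G). A minimal update loop with the standard BKT equations (the paper's contribution concerns how these parameters are fit, not the update itself):

```python
def bkt_trace(observations, p_l0=0.3, p_t=0.2, p_s=0.1, p_g=0.2):
    """Return P(mastery) after each response (1 = correct, 0 = incorrect)."""
    p_l = p_l0
    trace = []
    for correct in observations:
        if correct:
            # Posterior P(mastered | correct response)
            cond = p_l * (1 - p_s) / (p_l * (1 - p_s) + (1 - p_l) * p_g)
        else:
            # Posterior P(mastered | incorrect response)
            cond = p_l * p_s / (p_l * p_s + (1 - p_l) * (1 - p_g))
        # Learning transition: the skill may be acquired at this opportunity
        p_l = cond + (1 - cond) * p_t
        trace.append(p_l)
    return trace

print([round(p, 2) for p in bkt_trace([1, 0, 1, 1, 1])])
```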
Lee, Yunsoo; Song, Ji Hoon; Kim, Soo Jung – European Journal of Training and Development, 2023
Purpose: This paper aims to validate the Korean version of the decent work scale and examine the relationship between decent work and work engagement. Design/methodology/approach: After completing translation and back translation, the authors surveyed 266 Korean employees from various organizations via network sampling. They assessed Rasch's model…
Descriptors: Test Validity, Measures (Individuals), Work Attitudes, Test Reliability
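The Rasch model used in such validations expresses the probability of endorsing an item as a function of the difference between a person's trait level and the item's difficulty. The dichotomous form is shown below; polytomous survey items typically use the rating scale or partial credit extension:

```latex
% Dichotomous Rasch model: person parameter \theta_n, item difficulty b_i.
P(X_{ni}=1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}
```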