Publication Date
In 2025 | 5
Since 2024 | 23
Since 2021 (last 5 years) | 99
Since 2016 (last 10 years) | 293
Since 2006 (last 20 years) | 543
Descriptor
Correlation | 691
Test Items | 691
Foreign Countries | 209
Scores | 167
Item Analysis | 152
Difficulty Level | 147
Test Construction | 140
Item Response Theory | 139
Statistical Analysis | 127
Factor Analysis | 126
Test Reliability | 120
Author
Liu, Ou Lydia | 7
Sinharay, Sandip | 6
Stricker, Lawrence J. | 5
Dorans, Neil J. | 4
Kobrin, Jennifer L. | 4
Metsämuuronen, Jari | 4
Raykov, Tenko | 4
Reckase, Mark D. | 4
Soland, James | 4
Aryadoust, Vahid | 3
Attali, Yigal | 3
Audience
Researchers | 17
Practitioners | 3
Teachers | 3
Students | 1
Location
Turkey | 34
Canada | 15
Australia | 12
Taiwan | 11
United States | 11
Germany | 10
China | 9
Japan | 9
South Korea | 9
Netherlands | 8
United Kingdom | 8
Laws, Policies, & Programs
Individuals with Disabilities… | 1
No Child Left Behind Act 2001 | 1
United Nations Convention on… | 1
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 1
Meets WWC Standards with or without Reservations | 1
Xiaowen Liu – International Journal of Testing, 2024
Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…
Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation
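The Mantel-Haenszel procedure named in this abstract pools a common odds ratio for one studied item across total-score strata. As a hedged illustration (not the authors' code), with invented 2×2 tables per stratum:

```python
# Mantel-Haenszel common odds ratio for one studied item, pooled over
# total-score strata. Values near 1 suggest no DIF; values far from 1
# favor one group. Toy data only, not from the article.

def mh_odds_ratio(strata):
    """strata: list of (a, b, c, d) tuples per score level, where
    a/b = reference group correct/incorrect counts,
    c/d = focal group correct/incorrect counts."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Two score strata with a mild advantage for the reference group.
tables = [(30, 10, 20, 20), (40, 5, 30, 15)]
print(round(mh_odds_ratio(tables), 3))  # 3.4, i.e. reference-favoring
```

In reporting practice this ratio is often transformed to the delta scale as MH D-DIF = −2.35 ln(α̂), so that 0 means no DIF.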
Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025
This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…
Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis
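The residual-correlation method this abstract compares is usually operationalized as Yen's Q3: fit an IRT model, form each item's residuals (observed minus model-expected score), and correlate them across persons for each item pair. A self-contained sketch assuming Rasch expectations, with made-up abilities and difficulties:

```python
# Yen's Q3 for one item pair: correlate, over persons, the residuals
# of each item after removing the Rasch-expected probability.
# Abilities (thetas) and difficulties (b_i, b_j) below are illustrative.
import math

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def q3(obs_i, obs_j, thetas, b_i, b_j):
    res_i = [o - rasch_p(t, b_i) for o, t in zip(obs_i, thetas)]
    res_j = [o - rasch_p(t, b_j) for o, t in zip(obs_j, thetas)]
    return pearson(res_i, res_j)

thetas = [-1.0, -0.5, 0.0, 0.5, 1.0]
item_a = [0, 0, 1, 1, 1]
item_b = [0, 1, 1, 1, 1]
print(round(q3(item_a, item_b, thetas, 0.0, -0.3), 3))
```

Large positive Q3 values for a pair (relative to the other pairs) are taken as evidence of local dependence.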
Yunting Liu; Shreya Bhandari; Zachary A. Pardos – British Journal of Educational Technology, 2025
Effective educational measurement relies heavily on the curation of well-designed item pools. However, item calibration is time consuming and costly, requiring a sufficient number of respondents to estimate the psychometric properties of items. In this study, we explore the potential of six different large language models (LLMs; GPT-3.5, GPT-4,…
Descriptors: Artificial Intelligence, Test Items, Psychometrics, Educational Assessment
Mostafa Hosseinzadeh; Ki Lynn Matlock Cole – Educational and Psychological Measurement, 2024
In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and model specification on item parameter recovery in multidimensional Item Response Theory (MIRT) models, especially when the model was…
Descriptors: Item Response Theory, Models, Maximum Likelihood Statistics, Algorithms
Yuan, Lu; Huang, Yingshi; Li, Shuhang; Chen, Ping – Journal of Educational Measurement, 2023
Online calibration is a key technology for item calibration in computerized adaptive testing (CAT) and has been widely used in various forms of CAT, including unidimensional CAT, multidimensional CAT (MCAT), CAT with polytomously scored items, and cognitive diagnostic CAT. However, as multidimensional and polytomous assessment data become more…
Descriptors: Computer Assisted Testing, Adaptive Testing, Computation, Test Items
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated, and the deflation may be profound: 0.40–0.60 units of reliability, or 46–71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
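One standard estimator whose deflation is at issue here is coefficient alpha. A minimal pure-Python sketch of its textbook formula, α = k/(k−1) · (1 − Σσᵢ²/σ_X²), on toy 0/1 item data (not the article's data):

```python
# Coefficient alpha from a persons-by-items score matrix.

def cronbach_alpha(scores):
    k = len(scores[0])                      # number of items
    def var(xs):                            # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)
    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Four persons, three dichotomous items forming a Guttman-like pattern.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(cronbach_alpha(data))  # 0.75
```

The article's point is that such estimates sit below true reliability, with the gap driven by the eight sources it enumerates.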
Bolt, Daniel M.; Liao, Xiangyi – Journal of Educational Measurement, 2021
We revisit the empirically observed positive correlation between DIF and difficulty studied by Freedle and commonly seen in tests of verbal proficiency when comparing populations of different mean latent proficiency levels. It is shown that a positive correlation between DIF and difficulty estimates is actually an expected result (absent any true…
Descriptors: Test Bias, Difficulty Level, Correlation, Verbal Tests
Scott P. Ardoin; Katherine S. Binder; Paulina A. Kulesz; Eloise Nimocks; Joshua A. Mellott – Grantee Submission, 2024
Understanding test-taking strategies (TTSs) and the variables that influence TTSs is crucial to understanding what reading comprehension tests measure. We examined how passage and student characteristics were associated with TTSs and their impact on response accuracy. Third (n = 78), fifth (n = 86), and eighth (n = 86) graders read and answered…
Descriptors: Test Wiseness, Eye Movements, Reading Comprehension, Reading Tests
Metsämuuronen, Jari – International Journal of Educational Methodology, 2021
Although Goodman-Kruskal gamma (G) is used relatively rarely, it has promising potential as a coefficient of association in educational settings. Characteristics of G are studied in three sub-studies related to educational measurement settings. G appears to be unexpectedly appealing as an estimator of association between an item and a score because…
Descriptors: Educational Assessment, Measurement, Item Analysis, Correlation
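As a concrete reference point (not the article's code), G counts concordant and discordant pairs between item scores and total scores, ignoring ties:

```python
def gk_gamma(x, y):
    """Goodman-Kruskal gamma between two ordinal variables:
    (concordant - discordant) / (concordant + discordant),
    with tied pairs excluded."""
    conc = disc = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                conc += 1
            elif s < 0:
                disc += 1
    return (conc - disc) / (conc + disc)

# Toy item scores against toy total scores.
item  = [0, 1, 0, 1, 1]
total = [1, 2, 3, 4, 5]
print(round(gk_gamma(item, total), 3))  # 0.667
```

Because ties are dropped rather than penalized, G is not attenuated by the coarseness of a dichotomous item the way a Pearson-based item-score correlation is, which is part of its appeal in this setting.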
Guo, Wenjing; Choi, Youn-Jeng – Educational and Psychological Measurement, 2023
Determining the number of dimensions is extremely important in applying item response theory (IRT) models to data. Traditional and revised parallel analyses have been proposed within the factor analysis framework, and both have shown some promise in assessing dimensionality. However, their performance in the IRT framework has not been…
Descriptors: Item Response Theory, Evaluation Methods, Factor Analysis, Guidelines
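Traditional parallel analysis, as referenced here, retains the components whose observed eigenvalues exceed those of random data of the same shape. A sketch using NumPy on toy one-factor data (the data-generating numbers are invented, not from the study):

```python
# Traditional parallel analysis: compare eigenvalues of the observed
# correlation matrix with mean eigenvalues from random normal data.
import numpy as np

def parallel_analysis(data, n_sims=100, seed=0):
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    sims = np.empty((n_sims, p))
    for s in range(n_sims):
        rand = rng.standard_normal((n, p))
        sims[s] = np.linalg.eigvalsh(np.corrcoef(rand, rowvar=False))[::-1]
    return int(np.sum(obs > sims.mean(axis=0)))  # retained dimensions

# Toy data: six items loading on a single latent trait.
rng = np.random.default_rng(42)
theta = rng.standard_normal((500, 1))
items = 0.8 * theta + 0.6 * rng.standard_normal((500, 6))
print(parallel_analysis(items))  # 1 dominant dimension
```

The study's question is how well this factor-analytic rule transfers when the data are generated and analyzed under IRT models rather than linear factor models.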
Novak, Josip; Rebernjak, Blaž – Measurement: Interdisciplinary Research and Perspectives, 2023
A Monte Carlo simulation study was conducted to examine the performance of the α, λ₂, λ₄, ω_T, GLB_MRFA, and GLB_Algebraic coefficients. Population reliability, distribution shape, sample size, test length, and number of response categories were varied…
Descriptors: Monte Carlo Methods, Evaluation Methods, Reliability, Simulation
Krishna Mohan Surapaneni; Anusha Rajajagadeesan; Lakshmi Goudhaman; Shalini Lakshmanan; Saranya Sundaramoorthi; Dineshkumar Ravi; Kalaiselvi Rajendiran; Porchelvan Swaminathan – Biochemistry and Molecular Biology Education, 2024
The emergence of ChatGPT as one of the most advanced chatbots and its ability to generate diverse data has given room for numerous discussions worldwide regarding its utility, particularly in advancing medical education and research. This study seeks to assess the performance of ChatGPT in medical biochemistry to evaluate its potential as an…
Descriptors: Biochemistry, Science Instruction, Artificial Intelligence, Teaching Methods
Kilic, Abdullah Faruk; Uysal, Ibrahim – International Journal of Assessment Tools in Education, 2022
Most researchers investigate the corrected item-total correlation of items when analyzing item discrimination in multi-dimensional structures under the Classical Test Theory, which might lead to underestimating item discrimination, thereby removing items from the test. Researchers might investigate the corrected item-total correlation with the…
Descriptors: Item Analysis, Correlation, Item Response Theory, Test Items
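A sketch of the corrected item-total correlation this abstract discusses: each item is correlated with the total score computed without that item, so the item cannot inflate its own discrimination estimate. Toy data, not the authors':

```python
# Corrected item-total correlation: Pearson r between an item's scores
# and the rest-score (total minus that item).

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def corrected_item_total(scores, item):
    col = [row[item] for row in scores]
    rest = [sum(row) - row[item] for row in scores]
    return pearson(col, rest)

data = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 0]]
print(round(corrected_item_total(data, 0), 3))  # 0.707
```

The article's concern is that in multidimensional tests the rest-score mixes dimensions, so this statistic can understate the discrimination of items measuring a minor dimension.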
Huang, Qi; Bolt, Daniel M. – Educational and Psychological Measurement, 2023
Previous studies have demonstrated evidence of latent skill continuity even in tests intentionally designed for measurement of binary skills. In addition, the assumption of binary skills when continuity is present has been shown to potentially create a lack of invariance in item and latent ability parameters that may undermine applications. In…
Descriptors: Item Response Theory, Test Items, Skill Development, Robustness (Statistics)
Luan, Lin; Liang, Jyh-Chong; Chai, Ching Sing; Lin, Tzu-Bin; Dong, Yan – Interactive Learning Environments, 2023
The emergence of new media technologies has empowered individuals to not merely consume but also create, share and critique media contents. Such activities are dependent on new media literacy (NML) necessary for living and working in the participatory culture of the twenty-first century. Although a burgeoning body of research has focused on the…
Descriptors: Foreign Countries, Media Literacy, Test Construction, English (Second Language)