Publication Date
| Period | Records |
| --- | --- |
| In 2026 | 0 |
| Since 2025 | 55 |
| Since 2022 (last 5 years) | 220 |
| Since 2017 (last 10 years) | 568 |
| Since 2007 (last 20 years) | 955 |
Descriptor
| Descriptor | Records |
| --- | --- |
| Test Items | 1525 |
| Test Reliability | 1186 |
| Test Validity | 698 |
| Test Construction | 639 |
| Foreign Countries | 419 |
| Difficulty Level | 338 |
| Item Analysis | 309 |
| Item Response Theory | 281 |
| Psychometrics | 281 |
| Scores | 255 |
| Reliability | 246 |
Author
| Author | Records |
| --- | --- |
| Schoen, Robert C. | 12 |
| Anderson, Daniel | 6 |
| Guo, Hongwen | 6 |
| Liu, Ou Lydia | 6 |
| Reckase, Mark D. | 6 |
| Alonzo, Julie | 5 |
| Brennan, Robert L. | 5 |
| Downing, Steven M. | 5 |
| Frisbie, David A. | 5 |
| Haladyna, Thomas M. | 5 |
| LaVenia, Mark | 5 |
Audience
| Audience | Records |
| --- | --- |
| Practitioners | 40 |
| Researchers | 38 |
| Teachers | 26 |
| Administrators | 13 |
| Support Staff | 3 |
| Counselors | 2 |
| Students | 2 |
| Community | 1 |
| Parents | 1 |
| Policymakers | 1 |
Location
| Location | Records |
| --- | --- |
| Turkey | 80 |
| Indonesia | 38 |
| Germany | 25 |
| Australia | 22 |
| Canada | 21 |
| Florida | 20 |
| China | 19 |
| California | 16 |
| Taiwan | 13 |
| United States | 13 |
| India | 12 |
What Works Clearinghouse Rating
| Rating | Records |
| --- | --- |
| Meets WWC Standards without Reservations | 1 |
| Meets WWC Standards with or without Reservations | 1 |
Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025
This study examined the widely used threshold of 0.2 for Yen's Q3, an index for violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…
Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items
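As a rough illustration of what the Q3 index measures (this is not the authors' simulation code), the sketch below computes Q3 as the matrix of correlations between item residuals under a Rasch model with known item difficulties and abilities, an assumption made here for simplicity:

```python
import numpy as np

def rasch_prob(theta, b):
    """Rasch model probability of a correct response for each examinee-item pair."""
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def yen_q3(responses, theta, b):
    """Yen's Q3: correlations between item residuals after removing
    the model-implied probability of success."""
    resid = responses - rasch_prob(theta, b)   # examinee x item residual matrix
    return np.corrcoef(resid, rowvar=False)    # item-by-item Q3 matrix

# Simulated data: 500 examinees, 4 locally independent items
rng = np.random.default_rng(0)
theta = rng.normal(size=500)             # abilities
b = np.array([-1.0, -0.3, 0.4, 1.1])     # item difficulties
x = (rng.random((500, 4)) < rasch_prob(theta, b)).astype(float)

q3 = yen_q3(x, theta, b)
print(np.round(q3, 2))  # off-diagonal values stay well below the 0.2 threshold
```

In practice Q3 is computed from estimated (not true) parameters, which is part of what makes the fixed 0.2 cut-off questionable.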
Bin Tan; Nour Armoush; Elisabetta Mazzullo; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2025
This study reviews existing research on the use of large language models (LLMs) for automatic item generation (AIG). We performed a comprehensive literature search across seven research databases, selected studies based on predefined criteria, and summarized 60 relevant studies that employed LLMs in the AIG process. We identified the most commonly…
Descriptors: Artificial Intelligence, Test Items, Automation, Test Format
Hui Jin; Cynthia Lima; Limin Wang – Educational Measurement: Issues and Practice, 2025
Although AI transformer models have demonstrated notable capability in automated scoring, it is difficult to examine how and why these models fall short in scoring some responses. This study investigated how transformer models' language processing and quantification processes can be leveraged to enhance the accuracy of automated scoring. Automated…
Descriptors: Automation, Scoring, Artificial Intelligence, Accuracy
Abdullah Faruk Kiliç; Meltem Acar Güvendir; Gül Güler; Tugay Kaçak – Measurement: Interdisciplinary Research and Perspectives, 2025
This study outlines the extent to which wording effects impact factor structure and factor loadings, internal consistency, and measurement invariance. The modified form, which includes semantically reversed items, explains 21.5% more variance than the original form. Also, the reversed items' factor loadings are higher. As a result of CFA, indexes…
Descriptors: Test Items, Factor Structure, Test Reliability, Semantics
Christopher J. Anthony; Stephen N. Elliott – School Mental Health, 2025
Stress is a complex construct that is related to resilience and general health starting in childhood. Despite its importance for student health and well-being, there are few measures of stress designed for school-based applications. In this study, we developed and initially validated a Stress Indicators Scale using five samples of teachers,…
Descriptors: Test Construction, Stress Variables, Test Validity, Test Items
Parker, Mark A. J.; Hedgeland, Holly; Jordan, Sally E.; Braithwaite, Nicholas St. J. – European Journal of Science and Mathematics Education, 2023
The study covers the development and testing of the alternative mechanics survey (AMS), a modified force concept inventory (FCI), which used automatically marked free-response questions. Data were collected over a period of three academic years from 611 participants who were taking physics classes at high school and university level. A total of…
Descriptors: Test Construction, Scientific Concepts, Physics, Test Reliability
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023
Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the tests common when testing educational achievement. These tests are often structured by widely deviating item difficulties. This is a typical pattern where the traditional…
Descriptors: Test Reliability, Achievement Tests, Computation, Test Items
Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025
Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies induced among items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…
Descriptors: Models, Test Items, Educational Assessment, Scores
Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025
Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…
Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes
Joshua B. Gilbert; Zachary Himmelsbach; Luke W. Miratrix; Andrew D. Ho; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2025
Value added models (VAMs) attempt to estimate the causal effects of teachers and schools on student test scores. We apply Generalizability Theory to show how estimated VA effects depend upon the selection of test items. Standard VAMs estimate causal effects on the items that are included on the test. Generalizability demands consideration of how…
Descriptors: Value Added Models, Reliability, Effect Size, Test Items
Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023
There is no point in testing the knowledge, attributes, traits, behaviours, or abilities of an individual if the information obtained from the test is inaccurate. However, by and large, it seems the estimation of psychometric properties of test items in classrooms has been completely ignored, or is slowly dying out, in most testing environments. In…
Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis
Almehrizi, Rashid S. – Educational Measurement: Issues and Practice, 2022
Coefficient alpha reliability persists as the most common reliability coefficient reported in research. The assumptions for its use are, however, not well-understood. The current paper challenges the commonly used expressions of coefficient alpha and argues that while these expressions are correct when estimating reliability for summed scores,…
Descriptors: Reliability, Scores, Scaling, Statistical Analysis
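For reference, the commonly used expression of coefficient alpha for summed scores that this entry discusses can be sketched as follows; this is a minimal illustration assuming a complete examinee-by-item score matrix, not the paper's own analysis:

```python
import numpy as np

def coefficient_alpha(items):
    """Cronbach's coefficient alpha for an examinee x item score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of summed scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()     # sum of per-item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of summed scores
    return k / (k - 1) * (1.0 - item_vars / total_var)

# Toy 0/1 response matrix: 6 examinees x 4 items
x = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
])
print(round(coefficient_alpha(x), 3))  # → 0.833
```

The formula applies to summed scores specifically, which is the distinction the entry above draws for scaled scores.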
Atakan Yalcin; Cennet Sanli; Adnan Pinar – Journal of Theoretical Educational Science, 2025
This study aimed to develop a test to measure university students' spatial thinking skills. The research was conducted using a survey design, with a sample of 260 undergraduate students from geography teaching and geography departments. GIS software was used to incorporate maps and satellite images, enhancing the spatial representation in the…
Descriptors: Spatial Ability, Thinking Skills, Geography, Undergraduate Students
Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025
This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…
Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis
Rodrigo Moreta-Herrera; Jacqueline Regatto-Bonifaz; Víctor Viteri-Miranda; María Gorety Rodríguez-Vieira; Giancarlo Magro-Lazo; Jose A. Rodas; Sergio Dominguez-Lara – Journal of Psychoeducational Assessment, 2025
Objective: Analyze the evidence of validity of scores of the Academic Procrastination Scale (APS), its measurement equivalence based on nationality, its reliability of the scores, and its validity in relation to other variables in university students from Ecuador, Venezuela, and Peru. Method: This paper involves a quantitative, descriptive,…
Descriptors: Measures (Individuals), Time Management, College Students, Foreign Countries

