ERIC - Search Results

Publication Date

In 2025	30
Since 2024	67
Since 2021 (last 5 years)	233
Since 2016 (last 10 years)	507
Since 2006 (last 20 years)	728

Descriptor

Test Items	1161
Test Reliability	1161
Test Validity	663
Test Construction	555
Foreign Countries	335
Difficulty Level	271
Item Analysis	245
Psychometrics	226
Item Response Theory	214
Factor Analysis	176
Scores	169
Multiple Choice Tests	165
Higher Education	124
Correlation	120
Test Format	116
Scoring	111
Statistical Analysis	108
Measures (Individuals)	98
Goodness of Fit	90
Achievement Tests	89
Comparative Analysis	89
College Students	88
Factor Structure	88
Undergraduate Students	88
Test Bias	84
More ▼

Education Level

Higher Education	238
Postsecondary Education	193
Secondary Education	142
Elementary Education	118
Middle Schools	66
High Schools	65
Junior High Schools	49
Early Childhood Education	37
Elementary Secondary Education	35
Intermediate Grades	30
Primary Education	29
Grade 8	22
Grade 7	20
Grade 6	17
Kindergarten	17
Grade 5	16
Grade 2	15
Grade 1	13
Grade 3	13
Grade 4	13
Grade 9	9
Adult Education	7
Preschool Education	6
Grade 10	3
Grade 12	3
More ▼

Audience

Practitioners	39
Researchers	30
Teachers	24
Administrators	13
Support Staff	3
Counselors	2
Students	2
Community	1
Parents	1
Policymakers	1

Location

Turkey	68
Indonesia	30
Germany	19
Canada	17
Florida	17
China	16
Australia	15
California	12
Iran	11
India	10
New York	9
United States	9
Malaysia	8
Nigeria	8
Taiwan	8
Netherlands	7
Georgia	6
Illinois	6
Mexico	6
Nebraska	6
Saudi Arabia	6
South Korea	6
Thailand	6
Turkey (Ankara)	6
Turkey (Istanbul)	6
More ▼

Laws, Policies, & Programs

Individuals with Disabilities…	4
Every Student Succeeds Act…	3
No Child Left Behind Act 2001	3
Rehabilitation Act 1973…	3
Elementary and Secondary…	1
Head Start	1
Job Training Partnership Act…	1
United Nations Convention on…	1

What Works Clearinghouse Rating

Meets WWC Standards without Reservations	1
Meets WWC Standards with or without Reservations	1

Showing 1 to 15 of 1,161 results Save | Export

Another Look at Yen's Q3: Is 0.2 an Appropriate Cut-Off?

Peer reviewed

Direct link

Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025

This study examined the widely used threshold of 0.2 for Yen's Q3, an index for violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…

Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items

Examining the Wording Effect: What Are We Measuring?

Peer reviewed

Direct link

Abdullah Faruk Kiliç; Meltem Acar Güvendir; Gül Güler; Tugay Kaçak – Measurement: Interdisciplinary Research and Perspectives, 2025

In this study, the extent to wording effects impact structure and factor loadings, internal consistency and measurement invariance was outlined. The modified form, which includes items that semantically reversed, explains %21.5 more variance than the original form. Also, reversed items' factor loadings are higher. As a result of CFA, indexes…

Descriptors: Test Items, Factor Structure, Test Reliability, Semantics

Seeking the Real Reliability: Why the Traditional Estimators of Reliability Usually Fail in Achievement Testing and Why the Deflation-Corrected Coefficients Could Be Better Options

Peer reviewed
PDF on ERIC

Download full text

Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023

Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the tests common when testing educational achievement. These tests are often structured by widely deviating item difficulties. This is a typical pattern where the traditional…

Descriptors: Test Reliability, Achievement Tests, Computation, Test Items

Modeling Directional Testlet Effects on Multiple Open-Ended Questions

Peer reviewed

Direct link

Kuan-Yu Jin; Wai-Lok Siu – Journal of Educational Measurement, 2025

Educational tests often have a cluster of items linked by a common stimulus ("testlet"). In such a design, the dependencies caused between items are called "testlet effects." In particular, the directional testlet effect (DTE) refers to a recursive influence whereby responses to earlier items can positively or negatively affect…

Descriptors: Models, Test Items, Educational Assessment, Scores

Comparing and Combining IRTree Models and Anchoring Vignettes in Addressing Response Styles

Peer reviewed

Direct link

Mingfeng Xue; Ping Chen – Journal of Educational Measurement, 2025

Response styles pose great threats to psychological measurements. This research compares IRTree models and anchoring vignettes in addressing response styles and estimating the target traits. It also explores the potential of combining them at the item level and total-score level (ratios of extreme and middle responses to vignettes). Four models…

Descriptors: Item Response Theory, Models, Comparative Analysis, Vignettes

Estimating the Psychometric Properties ("Item Difficulty, Discrimination and Reliability Indices") of Test Items Using Kuder-Richardson Approach (KR-20)

Peer reviewed
PDF on ERIC

Download full text

Ntumi, Simon; Agbenyo, Sheilla; Bulala, Tapela – Shanlax International Journal of Education, 2023

There is no need or point to testing of knowledge, attributes, traits, behaviours or abilities of an individual if information obtained from the test is inaccurate. However, by and large, it seems the estimation of psychometric properties of test items in classroomshas been completely ignored otherwise dying slowly in most testing environments. In…

Descriptors: Psychometrics, Accuracy, Test Validity, Factor Analysis

Developing a Spatial Thinking Skills Test in Geography Teaching

Peer reviewed
PDF on ERIC

Download full text

Atakan Yalcin; Cennet Sanli; Adnan Pinar – Journal of Theoretical Educational Science, 2025

This study aimed to develop a test to measure university students' spatial thinking skills. The research was conducted using a survey design, with a sample of 260 undergraduate students from geography teaching and geography departments. GIS software was used to incorporate maps and satellite images, enhancing the spatial representation in the…

Descriptors: Spatial Ability, Thinking Skills, Geography, Undergraduate Students

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Direct link

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

Comparative Evaluation of C-Test Reliability Using Classical and Modern Psychometric Methods

Peer reviewed
PDF on ERIC

Download full text

Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025

This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…

Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests

The Effects of Reverse Items on Psychometric Properties and Respondents' Scale Scores According to Different Item Reversal Strategies

Peer reviewed
PDF on ERIC

Download full text

Mustafa Ilhan; Nese Güler; Gülsen Tasdelen Teker; Ömer Ergenekon – International Journal of Assessment Tools in Education, 2024

This study aimed to examine the effects of reverse items created with different strategies on psychometric properties and respondents' scale scores. To this end, three versions of a 10-item scale in the research were developed: 10 positive items were integrated in the first form (Form-P) and five positive and five reverse items in the other two…

Descriptors: Test Items, Psychometrics, Scores, Measures (Individuals)

Utilizing Real-Time Test Data to Solve Attenuation Paradox in Computerized Adaptive Testing to Enhance Optimal Design

Peer reviewed

Direct link

Jyun-Hong Chen; Hsiu-Yi Chao – Journal of Educational and Behavioral Statistics, 2024

To solve the attenuation paradox in computerized adaptive testing (CAT), this study proposes an item selection method, the integer programming approach based on real-time test data (IPRD), to improve test efficiency. The IPRD method turns information regarding the ability distribution of the population from real-time test data into feasible test…

Descriptors: Data Use, Computer Assisted Testing, Adaptive Testing, Design

Validation of an Elicited Imitation Test as a Measure of Korean Language Proficiency

Peer reviewed

Direct link

Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024

This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…

Descriptors: Korean, Test Validity, Test Reliability, Imitation

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Peer reviewed

Direct link

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025

To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…

Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory

Improvised Progressive Model Based on Automatic Calibration of Difficulty Level: A Practical Solution of Competitive-Based Examination

Peer reviewed

Direct link

Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024

Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…

Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction

We Before Me: Developing a Self-Referent Measure of Cultural Humility for Postsecondary Students

Peer reviewed

Direct link

Melissa Whatley; Dominique Foster; Stephen Paul – Journal of Studies in International Education, 2024

The purpose of this study was to develop a measurement instrument that scholars and practitioners in international education can use as a means of exploring whether and how individuals who come into contact with international education programs develop a greater sense of cultural humility. Specifically, the study described here outlines the four…

Descriptors: Foreign Students, Cultural Awareness, Consciousness Raising, Test Construction

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ... | 78

Educational and Psychological…	54
Journal of Psychoeducational…	31
Online Submission	31
Journal of Educational…	26
ProQuest LLC	24
Grantee Submission	18
ETS Research Report Series	16
International Journal of…	16
SAGE Open	14
Applied Psychological…	13
Applied Measurement in…	11
Language Testing	10
International Journal of…	9
International Journal of…	9
Journal of Experimental…	9
Chemistry Education Research…	8
Language Assessment Quarterly	8
Physical Review Physics…	8
Education and Information…	7
Educational Sciences: Theory…	7
Measurement and Evaluation in…	7
Practical Assessment,…	7
Psychometrika	7
Eurasian Journal of…	6
Language Testing in Asia	6
More ▼

Schoen, Robert C.	12
LaVenia, Mark	5
Liu, Ou Lydia	5
Anderson, Daniel	4
Bauduin, Charity	4
DiLuzio, Geneva J.	4
Farina, Kristy	4
Haladyna, Thomas M.	4
Huck, Schuyler W.	4
Petscher, Yaacov	4
Stansfield, Charles W.	4
Trevisan, Michael S.	4
Wainer, Howard	4
Yang, Xiaotong	4
Aiken, Lewis R.	3
Alonzo, Julie	3
Baghaei, Purya	3
Benson, Jeri	3
Boone, William J.	3
Brennan, Robert L.	3
Burton, Richard F.	3
Dogan, Nuri	3
Downing, Steven M.	3
Edwards, Michael C.	3
More ▼

Reports - Research	836
Journal Articles	789
Reports - Evaluative	144
Speeches/Meeting Papers	126
Tests/Questionnaires	103
Reports - Descriptive	60
Guides - Non-Classroom	28
Dissertations/Theses -…	24
Numerical/Quantitative Data	23
Opinion Papers	19
Information Analyses	18
Guides - Classroom - Teacher	8
Books	5
Guides - General	5
Book/Product Reviews	2
Collected Works - General	2
ERIC Digests in Full Text	2
ERIC Publications	2
Guides - Classroom - Learner	2
Reference Materials -…	2
Collected Works - Serials	1
Computer Programs	1
Multilingual/Bilingual…	1
Non-Print Media	1
Reference Materials -…	1
More ▼

SAT (College Admission Test)	10
Test of English as a Foreign…	9
ACT Assessment	6
Graduate Record Examinations	5
Trends in International…	5
Wechsler Intelligence Scale…	5
Raven Progressive Matrices	4
Test of English for…	4
Comprehensive Tests of Basic…	3
Dynamic Indicators of Basic…	3
Marlowe Crowne Social…	3
Measures of Academic Progress	3
Peabody Picture Vocabulary…	3
Stanford Achievement Tests	3
Advanced Placement…	2
Armed Services Vocational…	2
Autism Diagnostic Observation…	2
Child Behavior Checklist	2
Flesch Kincaid Grade Level…	2
Graduate Management Admission…	2
International English…	2
Iowa Tests of Basic Skills	2
National Assessment of…	2
Progress in International…	2
Rosenberg Self Esteem Scale	2
More ▼