ERIC - Search Results

Publication Date

In 2025	7
Since 2024	12
Since 2021 (last 5 years)	60
Since 2016 (last 10 years)	126
Since 2006 (last 20 years)	169

Descriptor

Difficulty Level	270
Test Items	270
Test Reliability	270
Test Validity	129
Test Construction	112
Foreign Countries	96
Item Response Theory	70
Item Analysis	65
Multiple Choice Tests	65
Psychometrics	45
Scores	42
Higher Education	30
Science Tests	30
Achievement Tests	27
Comparative Analysis	27
Correlation	27
Test Format	27
Undergraduate Students	27
High School Students	26
Statistical Analysis	25
Mathematics Tests	24
Elementary School Students	22
Thinking Skills	22
Language Tests	21
Scientific Concepts	21
More ▼

Publication Type

Reports - Research	221
Journal Articles	178
Speeches/Meeting Papers	29
Reports - Evaluative	23
Tests/Questionnaires	20
Reports - Descriptive	9
Dissertations/Theses -…	7
Numerical/Quantitative Data	5
Guides - Non-Classroom	2
Opinion Papers	2
Collected Works - Serials	1
Computer Programs	1
Guides - General	1
Information Analyses	1
Reports - General	1
More ▼

Education Level

Higher Education	57
Postsecondary Education	51
Secondary Education	49
Elementary Education	39
High Schools	24
Middle Schools	22
Junior High Schools	15
Early Childhood Education	10
Intermediate Grades	10
Primary Education	10
Grade 7	9
Grade 8	8
Kindergarten	8
Elementary Secondary Education	7
Grade 1	7
Grade 6	6
Grade 2	5
Grade 5	4
Grade 9	4
Grade 10	2
Grade 12	2
Grade 3	2
Grade 4	2
Grade 11	1
Preschool Education	1
More ▼

Audience

Researchers	7
Practitioners	2
Teachers	2

Location

Indonesia	17
Turkey	16
Florida	7
Germany	7
Canada	4
Japan	4
Nigeria	4
South Korea	4
United States	4
Australia	3
China	3
India	3
New York	3
Turkey (Istanbul)	3
Colorado	2
Georgia	2
Idaho	2
Indiana	2
Iran	2
Jordan	2
Norway	2
Taiwan	2
Thailand	2
Turkey (Ankara)	2
Arizona	1
More ▼

Laws, Policies, & Programs

Elementary and Secondary…	1
No Child Left Behind Act 2001	1

What Works Clearinghouse Rating

Showing 1 to 15 of 270 results Save | Export

Seeking the Real Reliability: Why the Traditional Estimators of Reliability Usually Fail in Achievement Testing and Why the Deflation-Corrected Coefficients Could Be Better Options

Peer reviewed
PDF on ERIC

Download full text

Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2023

Traditional estimators of reliability such as coefficients alpha, theta, omega, and rho (maximal reliability) are prone to give radical underestimates of reliability for the tests common when testing educational achievement. These tests are often structured by widely deviating item difficulties. This is a typical pattern where the traditional…

Descriptors: Test Reliability, Achievement Tests, Computation, Test Items

Comparative Evaluation of C-Test Reliability Using Classical and Modern Psychometric Methods

Peer reviewed
PDF on ERIC

Download full text

Neda Kianinezhad; Mohsen Kianinezhad – Language Education & Assessment, 2025

This study presents a comparative analysis of classical reliability measures, including Cronbach's alpha, test-retest, and parallel forms reliability, alongside modern psychometric methods such as the Rasch model and Mokken scaling, to evaluate the reliability of C-tests in language proficiency assessment. Utilizing data from 150 participants…

Descriptors: Psychometrics, Test Reliability, Language Proficiency, Language Tests

Validation of an Elicited Imitation Test as a Measure of Korean Language Proficiency

Peer reviewed

Direct link

Hojung Kim; Changkyung Song; Jiyoung Kim; Hyeyun Jeong; Jisoo Park – Language Testing in Asia, 2024

This study presents a modified version of the Korean Elicited Imitation (EI) test, designed to resemble natural spoken language, and validates its reliability as a measure of proficiency. The study assesses the correlation between average test scores and Test of Proficiency in Korean (TOPIK) levels, examining score distributions among beginner,…

Descriptors: Korean, Test Validity, Test Reliability, Imitation

Is Effort Moderated Scoring Robust to Multidimensional Rapid Guessing?

Peer reviewed

Direct link

Joseph A. Rios; Jiayi Deng – Educational and Psychological Measurement, 2025

To mitigate the potential damaging consequences of rapid guessing (RG), a form of noneffortful responding, researchers have proposed a number of scoring approaches. The present simulation study examines the robustness of the most popular of these approaches, the unidimensional effort-moderated (EM) scoring procedure, to multidimensional RG (i.e.,…

Descriptors: Scoring, Guessing (Tests), Reaction Time, Item Response Theory

Improvised Progressive Model Based on Automatic Calibration of Difficulty Level: A Practical Solution of Competitive-Based Examination

Peer reviewed

Direct link

Aditya Shah; Ajay Devmane; Mehul Ranka; Prathamesh Churi – Education and Information Technologies, 2024

Online learning has grown due to the advancement of technology and flexibility. Online examinations measure students' knowledge and skills. Traditional question papers include inconsistent difficulty levels, arbitrary question allocations, and poor grading. The suggested model calibrates question paper difficulty based on student performance to…

Descriptors: Computer Assisted Testing, Difficulty Level, Grading, Test Construction

Assessment of Item and Test Parameters: Cosine Similarity Approach

Peer reviewed
PDF on ERIC

Download full text

Chakrabartty, Satyendra Nath – International Journal of Psychology and Educational Studies, 2021

The paper proposes new measures of difficulty and discriminating values of binary items and test consisting of such items and find their relationships including estimation of test error variance and thereby the test reliability, as per definition using cosine similarities. The measures use entire data. Difficulty value of test and item is defined…

Descriptors: Test Items, Difficulty Level, Scores, Test Reliability

Assessing Lower-Secondary School Students' Critical Thinking Skills in Photosynthesis: A Rasch Model Approach

Peer reviewed
PDF on ERIC

Download full text

Suwita Suwita; Sulistyo Saputro; Sajidan Sajidan; Sutarno Sutarno – Journal of Baltic Science Education, 2024

The current study uses the Rasch Model to measure lower-secondary school students' critical thinking skills on photosynthesis topics. Critical thinking skills are considered essential in science education, but few valid and practical measurement instruments remain. The current study fills the gap by adapting the instrument from the Watson-Glaser…

Descriptors: Secondary School Students, Critical Thinking, Thinking Skills, Botany

A Novel Examination of None-of-the-Above as It Influences Examinee Item Responses

Direct link

Thompson, Kathryn N. – ProQuest LLC, 2023

It is imperative to collect validity evidence prior to interpreting and using test scores. During the process of collecting validity evidence, test developers should consider whether test scores are contaminated by sources of extraneous information. This is referred to as construct irrelevant variance, or the "degree to which test scores are…

Descriptors: Test Wiseness, Test Items, Item Response Theory, Scores

Developing and Validating a Biological System Thinking Test for Middle School Students

Peer reviewed

Direct link

Ruying Li; Gaofeng Li – International Journal of Science and Mathematics Education, 2025

Systems thinking (ST) is an essential competence for future life and biology learning. Appropriate assessment is critical for collecting sufficient information to develop ST in biology education. This research offers an ST framework based on a comprehensive understanding of biological systems, encompassing four skills across three complexity…

Descriptors: Test Construction, Test Validity, Science Tests, Cognitive Tests

Validity and Reliability Analysis of a Socioscientific Issues-Based Critical Thinking Self-Assessment Instrument Using the Rasch Model

Peer reviewed
PDF on ERIC

Download full text

Y. Yokhebed; Rexy Maulana Dwi Karmadi; Luvia Ranggi Nastiti – Journal of Biological Education Indonesia (Jurnal Pendidikan Biologi Indonesia), 2025

Although self-assessment in critical thinking is thought to help students recognise their strengths and weaknesses, the reliability and validity of the assessment tool is still questionable, so a more objective evaluation is needed. Objective of this investigation is to assess the self-assessment tools in evaluating students' critical thinking…

Descriptors: Self Evaluation (Individuals), Critical Thinking, Science and Society, Test Validity

Design, Development, and Evaluation of the Organic Chemistry Representational Competence Assessment (ORCA)

Peer reviewed

Direct link

Lyniesha Ward; Fridah Rotich; Jeffrey R. Raker; Regis Komperda; Sachin Nedungadi; Maia Popova – Chemistry Education Research and Practice, 2025

This paper describes the design and evaluation of the Organic chemistry Representational Competence Assessment (ORCA). Grounded in Kozma and Russell's representational competence framework, the ORCA measures the learner's ability to "interpret," "translate," and "use" six commonly used representations of molecular…

Descriptors: Organic Chemistry, Science Tests, Test Construction, Student Evaluation

Examining the Effect of Item Difficulty and Rater Leniency on Iranian Test Takers' Performance on WDCT and DSAT: A Comparative Study

Peer reviewed
PDF on ERIC

Download full text

Reza Shahi; Hamdollah Ravand; Golam Reza Rohani – International Journal of Language Testing, 2025

The current paper intends to exploit the Many Facet Rasch Model to investigate and compare the impact of situations (items) and raters on test takers' performance on the Written Discourse Completion Test (WDCT) and Discourse Self-Assessment Tests (DSAT). In this study, the participants were 110 English as a Foreign Language (EFL) students at…

Descriptors: Comparative Analysis, English (Second Language), Second Language Learning, Second Language Instruction

Somers' D as an Alternative for the Item-Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings

Peer reviewed
PDF on ERIC

Download full text

Metsämuuronen, Jari – International Journal of Educational Methodology, 2020

Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…

Descriptors: Correlation, Test Items, Scores, Difficulty Level

The Effect of Multiple-Choice Test Items' Difficulty Degree on the Reliability Coefficient and the Standard Error of Measurement Depending on the Item Response Theory (IRT)

Peer reviewed
PDF on ERIC

Download full text

Al-zboon, Habis Saad; Alrekebat, Amjad Farhan – International Journal of Higher Education, 2021

This study aims at identifying the effect of multiple-choice test items' difficulty degree on the reliability coefficient and the standard error of measurement depending on the item response theory IRT. To achieve the objectives of the study, (WinGen3) software was used to generate the IRT parameters (difficulty, discrimination, guessing) for four…

Descriptors: Multiple Choice Tests, Test Items, Difficulty Level, Error of Measurement

Comparison of G and Phi Coefficients Estimated in Generalizability Theory with Real Cases

Peer reviewed
PDF on ERIC

Download full text

Deniz, Kaan Zulfikar; Ilican, Emel – International Journal of Assessment Tools in Education, 2021

This study aims to compare the G and Phi coefficients as estimated by D studies for a measurement tool with the G and Phi coefficients obtained from real cases in which items of differing difficulty levels were added and also to determine the conditions under which the D studies estimated reliability coefficients closer to reality. The study group…

Descriptors: Generalizability Theory, Test Items, Difficulty Level, Test Reliability

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | ... | 18

Educational and Psychological…	10
Online Submission	9
Journal of Educational…	8
ProQuest LLC	7
Grantee Submission	5
Journal of Experimental…	5
Applied Measurement in…	4
International Journal of…	4
SAGE Open	4
Applied Psychological…	3
Chemistry Education Research…	3
ETS Research Report Series	3
International Journal of…	3
International Journal of…	3
International Journal of…	3
International Journal of…	3
Journal of Turkish Science…	3
Physical Review Physics…	3
Practical Assessment,…	3
Advances in Health Sciences…	2
Assessment for Effective…	2
Behavioral Research and…	2
CBE - Life Sciences Education	2
Cypriot Journal of…	2
Educational Assessment	2
More ▼

Schoen, Robert C.	6
DiLuzio, Geneva J.	4
Yang, Xiaotong	4
Anderson, Daniel	3
Huck, Schuyler W.	3
Paek, Insu	3
Thompson, Bruce	3
Weiten, Wayne	3
Alexander, Patricia A.	2
Alonzo, Julie	2
Bauduin, Charity	2
Cliff, Norman	2
Feldt, Leonard S.	2
Frisbie, David A.	2
Henning, Grant	2
Istiyono, Edi	2
Lee, Young-Sun	2
Liu, Sicong	2
Loyd, Brenda H.	2
Metsämuuronen, Jari	2
Mike Stieff	2
Perez, Kathryn E.	2
Petscher, Yaacov	2
Pollock, Steven J.	2
More ▼

Test of English as a Foreign…	3
Flesch Kincaid Grade Level…	2
Raven Progressive Matrices	2
SAT (College Admission Test)	2
Test of English for…	2
Adult Attachment Interview	1
Armed Services Vocational…	1
Bayley Scales of Infant…	1
Child Behavior Checklist	1
Comprehensive Tests of Basic…	1
Defining Issues Test	1
Dynamic Indicators of Basic…	1
Embedded Figures Test	1
Flesch Reading Ease Formula	1
Graduate Management Admission…	1
Graduate Record Examinations	1
Hidden Figures Test	1
Iowa Tests of Basic Skills	1
Matching Familiar Figures Test	1
Measures of Academic Progress	1
Metropolitan Achievement Tests	1
Peabody Developmental Motor…	1
Praxis Series	1
Stanford Achievement Tests	1
Stanford Binet Intelligence…	1
More ▼