ERIC - Search Results

Publication Date

In 2026	0
Since 2025	3
Since 2022 (last 5 years)	13
Since 2017 (last 10 years)	45
Since 2007 (last 20 years)	112

Descriptor

Correlation	149
Difficulty Level	149
Test Items	149
Item Response Theory	47
Item Analysis	43
Foreign Countries	40
Comparative Analysis	38
Test Reliability	28
Scores	25
Test Construction	25
Statistical Analysis	23
Test Bias	21
Psychometrics	19
Test Validity	19
Mathematics Tests	18
Multiple Choice Tests	18
Simulation	16
Factor Analysis	15
Models	15
Test Format	15
College Entrance Examinations	14
Undergraduate Students	14
Accuracy	13
Language Tests	11
Sample Size	11
More ▼

Publication Type

Reports - Research	116
Journal Articles	102
Speeches/Meeting Papers	19
Reports - Evaluative	12
Dissertations/Theses -…	10
Tests/Questionnaires	7
Reports - Descriptive	5
Opinion Papers	3
Information Analyses	2
Numerical/Quantitative Data	2
Non-Print Media	1
Reference Materials - General	1
More ▼

Education Level

Higher Education	34
Postsecondary Education	30
Elementary Education	11
Secondary Education	9
Junior High Schools	5
Middle Schools	5
High Schools	4
Elementary Secondary Education	3
Intermediate Grades	3
Early Childhood Education	2
Grade 12	2
Grade 2	2
Grade 5	2
Grade 7	2
Primary Education	2
Grade 4	1
Grade 6	1
Grade 9	1
Two Year Colleges	1
More ▼

Audience

Researchers

Location

Indonesia	4
Turkey	4
Germany	3
Australia	2
Canada	2
South Korea	2
Belgium	1
Cyprus	1
Czech Republic	1
District of Columbia	1
Finland	1
Greece	1
Hong Kong	1
Illinois	1
Japan	1
Malaysia	1
Nebraska	1
New York	1
Nigeria	1
Ohio	1
Oman	1
Saudi Arabia	1
Taiwan	1
Tennessee	1
Turkey (Ankara)	1
More ▼

Laws, Policies, & Programs

Assessments and Surveys

SAT (College Admission Test)	11
Graduate Record Examinations	4
National Assessment of…	3
Trends in International…	3
Wechsler Intelligence Scale…	3
Program for International…	2
Advanced Placement…	1
Communication and Symbolic…	1
Defining Issues Test	1
Digit Span Test	1
Flesch Kincaid Grade Level…	1
Iowa Tests of Basic Skills	1
Peabody Picture Vocabulary…	1
SRA Achievement Series	1
Wechsler Adult Intelligence…	1
Wechsler Memory Scale	1
More ▼

What Works Clearinghouse Rating

Showing 1 to 15 of 149 results Save | Export

Evaluation of Exam Questions Using Bootstrapping: Practical Applications in R and SPSS with a Case Study

Peer reviewed

Direct link

Changiz Mohiyeddini – Anatomical Sciences Education, 2025

This article presents a step-by-step guide to using R and SPSS to bootstrap exam questions. Bootstrapping, a versatile nonparametric analytical technique, can help to improve the psychometric qualities of exam questions in the process of quality assurance. Bootstrapping is particularly useful in disciplines such as medical education, where student…

Descriptors: Test Items, Sampling, Statistical Inference, Nonparametric Statistics

On the Positive Correlation between DIF and Difficulty: A New Theory on the Correlation as Methodological Artifact

Peer reviewed

Direct link

Bolt, Daniel M.; Liao, Xiangyi – Journal of Educational Measurement, 2021

We revisit the empirically observed positive correlation between DIF and difficulty studied by Freedle and commonly seen in tests of verbal proficiency when comparing populations of different mean latent proficiency levels. It is shown that a positive correlation between DIF and difficulty estimates is actually an expected result (absent any true…

Descriptors: Test Bias, Difficulty Level, Correlation, Verbal Tests

Somers' D as an Alternative for the Item-Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings

Peer reviewed
PDF on ERIC

Download full text

Metsämuuronen, Jari – International Journal of Educational Methodology, 2020

Pearson product-moment correlation coefficient between item g and test score X, known as item-test or item-total correlation ("Rit"), and item-rest correlation ("Rir") are two of the most used classical estimators for item discrimination power (IDP). Both "Rit" and "Rir" underestimate IDP caused by the…

Descriptors: Correlation, Test Items, Scores, Difficulty Level

Development of a Four-Tier Diagnostic Test for Misconceptions in Natural Science of Primary School Pupils

Peer reviewed
PDF on ERIC

Download full text

Anatri Desstya; Ika Candra Sayekti; Muhammad Abduh; Sukartono – Journal of Turkish Science Education, 2025

This study aimed to develop a standardised instrument for diagnosing science misconceptions in primary school children. Following a developmental research approach using the 4-D model (Define, Design, Develop, Disseminate), 100 four-tier multiple choice items were constructed. Content validity was established through expert evaluation by six…

Descriptors: Test Construction, Science Tests, Science Instruction, Diagnostic Tests

Preliminary Findings to Support the Internal Consistency and Factor Structure of the Ferrari-Lynch-Vogel Listening Test (FLVLT)

Peer reviewed

Direct link

Ferrari-Bridgers, Franca – International Journal of Listening, 2023

While many tools exist to assess student content knowledge, there are few that assess whether students display the critical listening skills necessary to interpret the quality of a speaker's message at the college level. The following research provides preliminary evidence for the internal consistency and factor structure of a tool, the…

Descriptors: Factor Structure, Test Validity, Community College Students, Test Reliability

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Direct link

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Direct link

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

Examining the Relationship between the Readability and Comprehensibility of Practice Test Questions and Failure Rates on Learner's Permit Knowledge Tests

Peer reviewed

Direct link

Flint, Kaitlyn; Spaulding, Tammie J. – Language, Speech, and Hearing Services in Schools, 2021

Purpose: The readability and comprehensibility of Learner's Permit Knowledge Test practice questions and the relationship with test failure rates across states and the District of Columbia were examined. Method: Failure rates were obtained from department representatives. Practice test questions were extracted from drivers' manuals and department…

Descriptors: Correlation, Readability Formulas, Reading Comprehension, Difficulty Level

Disentangling Person-Dependent and Item-Dependent Causal Effects: Applications of Item Response Theory to the Estimation of Treatment Effect Heterogeneity

Peer reviewed

Direct link

Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Journal of Educational and Behavioral Statistics, 2025

Analyzing heterogeneous treatment effects (HTEs) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and preintervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…

Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics

Investigation of the Effect of Parameter Estimation and Classification Accuracy in Mixture IRT Models under Different Conditions

Peer reviewed
PDF on ERIC

Download full text

Saatcioglu, Fatima Munevver; Atar, Hakan Yavuz – International Journal of Assessment Tools in Education, 2022

This study aims to examine the effects of mixture item response theory (IRT) models on item parameter estimation and classification accuracy under different conditions. The manipulated variables of the simulation study are set as mixture IRT models (Rasch, 2PL, 3PL); sample size (600, 1000); the number of items (10, 30); the number of latent…

Descriptors: Accuracy, Classification, Item Response Theory, Programming Languages

Disentangling Person-Dependent and Item-Dependent Causal Effects: Applications of Item Response Theory to the Estimation of Treatment Effect Heterogeneity. EdWorkingPaper No. 23-881

Download full text

Joshua B. Gilbert; Luke W. Miratrix; Mridul Joshi; Benjamin W. Domingue – Annenberg Institute for School Reform at Brown University, 2024

Analyzing heterogeneous treatment effects (HTE) plays a crucial role in understanding the impacts of educational interventions. A standard practice for HTE analysis is to examine interactions between treatment status and pre-intervention participant characteristics, such as pretest scores, to identify how different groups respond to treatment.…

Descriptors: Causal Models, Item Response Theory, Statistical Inference, Psychometrics

A Special Case of Brennan's Index for Tests That Aim to Select a Limited Number of Students: A Monte Carlo Simulation Study

Peer reviewed

Direct link

Arikan, Serkan; Aybek, Eren Can – Educational Measurement: Issues and Practice, 2022

Many scholars compared various item discrimination indices in real or simulated data. Item discrimination indices, such as item-total correlation, item-rest correlation, and IRT item discrimination parameter, provide information about individual differences among all participants. However, there are tests that aim to select a very limited number…

Descriptors: Monte Carlo Methods, Item Analysis, Correlation, Individual Differences

A Baseline for Multiple-Choice Testing in the University Classroom

Peer reviewed

Direct link

Slepkov, A. D.; Van Bussel, M. L.; Fitze, K. M.; Burr, W. S. – SAGE Open, 2021

There is a broad literature in multiple-choice test development, both in terms of item-writing guidelines, and psychometric functionality as a measurement tool. However, most of the published literature concerns multiple-choice testing in the context of expert-designed high-stakes standardized assessments, with little attention being paid to the…

Descriptors: Foreign Countries, Undergraduate Students, Student Evaluation, Multiple Choice Tests

Exploration of Student Cognitive Mathematics Ability Diagnostic Instruments: Validity, Reliability, and Item Characteristics

Peer reviewed
PDF on ERIC

Download full text

Hartono, Wahyu; Hadi, Samsul; Rosnawati, Raden; Retnawati, Heri – Pegem Journal of Education and Instruction, 2023

Researchers design diagnostic assessments to measure students' knowledge structures and processing skills to provide information about their cognitive attribute. The purpose of this study is to determine the instrument's validity and score reliability, as well as to investigate the use of classical test theory to identify item characteristics. The…

Descriptors: Diagnostic Tests, Test Validity, Item Response Theory, Content Validity

The Pattern of Test-Taking Effort across Items in Cognitive Ability Test: A Latent Class Analysis

Peer reviewed
PDF on ERIC

Download full text

Akhtar, Hanif – International Association for Development of the Information Society, 2022

When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…

Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10

ProQuest LLC	10
Educational and Psychological…	8
ETS Research Report Series	7
Journal of Educational…	5
Applied Psychological…	3
Educational Measurement:…	3
Journal of Psychoeducational…	3
CBE - Life Sciences Education	2
Educational Research and…	2
Educational Sciences: Theory…	2
Harvard Educational Review	2
International Journal of…	2
International Journal of…	2
Journal of Experimental…	2
Journal of Experimental…	2
Journal of Research in…	2
Online Submission	2
Structural Equation Modeling:…	2
Advances in Physiology…	1
American Annals of the Deaf	1
American Educational Research…	1
Anatomical Sciences Education	1
Annenberg Institute for…	1
Applied Measurement in…	1
Asia Pacific Education Review	1
More ▼

Dorans, Neil J.	4
Holland, Paul	3
Sinharay, Sandip	3
Attali, Yigal	2
Benjamin W. Domingue	2
DeMars, Christine E.	2
Joshua B. Gilbert	2
Kobrin, Jennifer L.	2
Livingston, Samuel A.	2
Luke W. Miratrix	2
Mridul Joshi	2
Sackett, Paul R.	2
Abdellah, Antar Solhy	1
Abshire, Elizabeth	1
Abu Kassim, Noor Lide	1
Ahn, Soyeon	1
Ahonen, Timo	1
Akar, Cüneyt	1
Akhtar, Hanif	1
Aktas, Elif	1
Albano, Anthony D.	1
Allalouf, Avi	1
Anatri Desstya	1
Angeles, Victor R.	1
More ▼