ERIC - Search Results

Publication Date

In 2025	1
Since 2024	8
Since 2021 (last 5 years)	21
Since 2016 (last 10 years)	55
Since 2006 (last 20 years)	95

Descriptor

Comparative Analysis	118
Correlation	118
Test Items	118
Foreign Countries	39
Difficulty Level	38
Scores	34
Item Response Theory	31
Item Analysis	28
Statistical Analysis	23
Language Tests	18
English (Second Language)	17
Factor Analysis	17
Second Language Learning	16
Mathematics Tests	15
Test Format	15
College Students	14
Simulation	14
Test Bias	14
Test Reliability	14
Accuracy	13
Achievement Tests	13
Multiple Choice Tests	13
Test Construction	13
Reliability	12
Computer Assisted Testing	11

Publication Type

Reports - Research	95
Journal Articles	82
Speeches/Meeting Papers	15
Dissertations/Theses -…	10
Tests/Questionnaires	10
Reports - Evaluative	8
Reports - Descriptive	3
Numerical/Quantitative Data	2
Books	1
Information Analyses	1
Non-Print Media	1
Reference Materials - General	1

Education Level

Higher Education	38
Postsecondary Education	32
Secondary Education	13
Elementary Education	11
High Schools	9
Elementary Secondary Education	5
Middle Schools	5
Grade 8	4
Junior High Schools	4
Early Childhood Education	3
Grade 4	3
Grade 9	3
Intermediate Grades	3
Primary Education	3
Grade 3	2
Grade 7	2
Grade 10	1
Grade 11	1
Grade 12	1
Grade 2	1
Grade 5	1
Grade 6	1
Kindergarten	1

Audience

Researchers	2
Practitioners	1
Students	1

Location

Taiwan	4
Germany	3
Japan	3
South Korea	3
Turkey	3
Canada	2
China	2
Nigeria	2
Philippines	2
Thailand	2
Vietnam	2
Australia	1
Botswana	1
Chile	1
China (Shanghai)	1
Czech Republic	1
District of Columbia	1
Europe	1
Georgia Republic	1
India	1
Indonesia	1
Israel	1
Malaysia	1
Montana	1
New York	1

Showing 1 to 15 of 118 results

Detecting Differential Item Functioning with Multiple Causes: A Comparison of Three Methods

Peer reviewed

Xiaowen Liu – International Journal of Testing, 2024

Differential item functioning (DIF) often arises from multiple sources. Within the context of multidimensional item response theory, this study examined DIF items with varying secondary dimensions using the three DIF methods: SIBTEST, Mantel-Haenszel, and logistic regression. The effect of the number of secondary dimensions on DIF detection rates…

Descriptors: Item Analysis, Test Items, Item Response Theory, Correlation

A Comparison of Yen's Q3 Coefficient and Rasch Testlet Modeling for Identifying Local Item Dependence: Evidence from Two Vocabulary Matching Tests

Peer reviewed

Hung Tan Ha; Duyen Thi Bich Nguyen; Tim Stoeckel – Language Assessment Quarterly, 2025

This article compares two methods for detecting local item dependence (LID): residual correlation examination and Rasch testlet modeling (RTM), in a commonly used 3:6 matching format and an extended matching test (EMT) format. The two formats are hypothesized to facilitate different levels of item dependency due to differences in the number of…

Descriptors: Comparative Analysis, Language Tests, Test Items, Item Analysis

A Comparison of Latent Semantic Analysis and Latent Dirichlet Allocation in Educational Measurement

Peer reviewed

Jordan M. Wheeler; Allan S. Cohen; Shiyu Wang – Journal of Educational and Behavioral Statistics, 2024

Topic models are mathematical and statistical models used to analyze textual data. The objective of topic models is to gain information about the latent semantic space of a set of related textual data. The semantic space of a set of textual data contains the relationship between documents and words and how they are used. Topic models are becoming…

Descriptors: Semantics, Educational Assessment, Evaluators, Reliability

An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models

Peer reviewed

Sedat Sen; Allan S. Cohen – Educational and Psychological Measurement, 2024

A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's…

Descriptors: Goodness of Fit, Item Response Theory, Sample Size, Classification

The Concurrent Validity of Comparative Judgement Outcomes Compared with Marks

Gill, Tim – Research Matters, 2022

In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…

Descriptors: Comparative Analysis, Decision Making, Scripts, Standards

Estimating the Impact of Local Item Dependency in a Test of Second Language Reading Comprehension

Peer reviewed

Tim Stoeckel; Liang Ye Tan; Hung Tan Ha; Nam Thi Phuong Ho; Tomoko Ishii; Young Ae Kim; Chunmei Huang; Stuart McLean – Vocabulary Learning and Instruction, 2024

Local item dependency (LID) occurs when test-takers' responses to one test item are affected by their responses to another. It can be problematic if it causes inflated reliability estimates or distorted person and item measures. The cued-recall reading comprehension test in Hu and Nation's (2000) well-known and influential coverage-comprehension…

Descriptors: Reading Comprehension, English (Second Language), Second Language Instruction, Second Language Learning

Reliability and Validity Evidence of Diagnostic Methods: Comparison of Diagnostic Classification Models and Item Response Theory-Based Methods

Yoo Jeong Jang – ProQuest LLC, 2022

Despite the increasing demand for diagnostic information, observed subscores have been often reported to lack adequate psychometric qualities such as reliability, distinctiveness, and validity. Therefore, several statistical techniques based on CTT and IRT frameworks have been proposed to improve the quality of subscores. More recently, DCM has…

Descriptors: Classification, Accuracy, Item Response Theory, Correlation

Closed Formula of Test Length Required for Adaptive Testing with Medium Probability of Solution

Peer reviewed

Kárász, Judit T.; Széll, Krisztián; Takács, Szabolcs – Quality Assurance in Education: An International Perspective, 2023

Purpose: Based on the general formula, which depends on the length and difficulty of the test, the number of respondents and the number of ability levels, this study aims to provide a closed formula for the adaptive tests with medium difficulty (probability of solution is p = 1/2) to determine the accuracy of the parameters for each item and in…

Descriptors: Test Length, Probability, Comparative Analysis, Difficulty Level

Reliability and Validity of Methods to Assess Undergraduate Healthcare Student Performance in Pharmacology: Comparison of Open Book versus Time-Limited Closed Book Examinations

Peer reviewed

David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023

We compared the influence of an open-book, extended-duration format versus a closed-book, time-limited format on the reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertakes a mid-year test (30 x free-response short-answer questions, SAQ) and an end-of-year paper (4 x SAQ,…

Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format

Assessing, Accommodating, and Guiding English Learners: A Collection of Studies

Stephanie B. Moore – ProQuest LLC, 2024

This three-manuscript dissertation attempts to answer the question: "How does students' English language proficiency (ELP) inform the availability, structure, and use of English language accommodations and intervention to support the academic achievement of English learner (EL) students?" The question is addressed using three independent…

Descriptors: English Language Learners, Language Proficiency, English (Second Language), Second Language Learning

Student Performance and Exam Quality in Student- versus Instructor-Created Exams in Human Physiology

Peer reviewed

Laura S. Kabiri; Catherine R. Barber; Thomas M. McCabe; Augusto X. Rodriguez – HAPS Educator, 2024

Multiple-choice questions (MCQs) are commonly used in undergraduate introductory science, technology, engineering, and mathematics (STEM) courses, and substantial evidence supports the use of student-created questions to promote learning. However, research on student-created MCQ exams as an assessment method is more limited, and no studies have…

Descriptors: Physiology, Science Tests, Student Developed Materials, Test Construction

The Pattern of Test-Taking Effort across Items in Cognitive Ability Test: A Latent Class Analysis

Peer reviewed

Akhtar, Hanif – International Association for Development of the Information Society, 2022

When examinees perceive a test as low stakes, it is logical to assume that some of them will not put out their maximum effort. This condition makes the validity of the test results more complicated. Although many studies have investigated motivational fluctuation across tests during a testing session, only a small number of studies have…

Descriptors: Intelligence Tests, Student Motivation, Test Validity, Student Attitudes

Item-Score Reliability in Empirical-Data Sets and Its Relationship with Other Item Indices

Peer reviewed

Zijlmans, Eva A. O.; Tijmstra, Jesper; van der Ark, L. Andries; Sijtsma, Klaas – Educational and Psychological Measurement, 2018

Reliability is usually estimated for a total score, but it can also be estimated for item scores. Item-score reliability can be useful to assess the repeatability of an individual item score in a group. Three methods to estimate item-score reliability are discussed, known as method MS, method λ₆, and method CA. The item-score…

Descriptors: Test Items, Test Reliability, Correlation, Comparative Analysis

Addressing Uncodable Behaviors: A Bayesian Ordinal Mixture Model Applied to a Mathematics Learning Trajectory Teaching Experiment

Peer reviewed

Pavel Chernyavskiy; Traci S. Kutaka; Carson Keeter; Julie Sarama; Douglas Clements – Grantee Submission, 2024

When researchers code behavior that is undetectable or falls outside of the validated ordinal scale, the resultant outcomes often suffer from informative missingness. Incorrect analysis of such data can lead to biased arguments around efficacy and effectiveness in the context of experimental and intervention research. Here, we detail a new…

Descriptors: Bayesian Statistics, Mathematics Instruction, Learning Trajectories, Item Response Theory

Sparse Factor Autoencoders for Item Response Theory

Peer reviewed

Paaßen, Benjamin; Dywel, Malwina; Fleckenstein, Melanie; Pinkwart, Niels – International Educational Data Mining Society, 2022

Item response theory (IRT) is a popular method to infer student abilities and item difficulties from observed test responses. However, IRT struggles with two challenges: How to map items to skills if multiple skills are present? And how to infer the ability of new students that have not been part of the training data? Inspired by recent advances…

Descriptors: Item Response Theory, Test Items, Item Analysis, Inferences

Source

ProQuest LLC	10
ETS Research Report Series	6
Educational and Psychological…	5
Journal of Educational…	4
Applied Psychological…	3
College Entrance Examination…	3
Educational Research and…	3
Assessment & Evaluation in…	2
International Journal of…	2
International Journal of…	2
Journal of Educational and…	2
Language Assessment Quarterly	2
Language Testing	2
Accounting Education	1
American Journal of…	1
Applied Measurement in…	1
Asia Pacific Education Review	1
Assessment in Education:…	1
Bilingual Research Journal	1
Biochemistry and Molecular…	1
British Journal of…	1
College Board	1
Creativity Research Journal	1
Cultural Studies of Science…	1
Educational Evaluation and…	1

Author

Stricker, Lawrence J.	3
Allan S. Cohen	2
Fu, Jianbin	2
Hung Tan Ha	2
O'Neal, Marcia R.	2
Sinharay, Sandip	2
Tim Stoeckel	2
Abrams, Eleanor	1
Acar, Tülin	1
Ahmed, Tamim	1
Akar, Cüneyt	1
Akhtar, Hanif	1
Aktas, Elif	1
Aliyu, Hassan	1
Allen, Nancy	1
Almehrizi, Rashid S.	1
Altinbas, Mehmet Emre	1
Anwyll, Steve	1
Ariel, Adelaide	1
Arth, Thomas O.	1
Attali, Yigal	1
Augusto X. Rodriguez	1
Awwad, Abeer	1
Baghi, Heibatollah	1

Assessments and Surveys

SAT (College Admission Test)	4
Trends in International…	3
Progress in International…	2
Test of English for…	2
Advanced Placement…	1
Armed Services Vocational…	1
Defining Issues Test	1
Differential Aptitude Test	1
Digit Span Test	1
General Aptitude Test Battery	1
Graduate Record Examinations	1
Law School Admission Test	1
Measures of Academic Progress	1
Minnesota Multiphasic…	1
National Assessment of…	1
Program for International…	1
Test of Standard Written…	1
Vocational Preference…	1
Wechsler Adult Intelligence…	1
Wechsler Intelligence Scale…	1