Publication Date: In 2025 (124); Since 2024 (373)
Author: Joshua B. Gilbert (5); Luke W. Miratrix (5); Benjamin W. Domingue (4); Kuan-Yu Jin (4); Okan Bulut (4); Allan S. Cohen (3); Jianbin Fu (3); Paul De Boeck (3); Selcuk Acar (3); Tim Stoeckel (3); Xuan Tan (3)
Location: Turkey (15); China (12); Indonesia (12); Iran (9); United Kingdom (8); Germany (7); Japan (7); United States (7); South Africa (5); Taiwan (5); United Kingdom (England) (5)
Laws, Policies, & Programs: Head Start (1)
Haeju Lee; Kyung Yong Kim – Journal of Educational Measurement, 2025
When no prior information about differential item functioning (DIF) exists for the items in a test, either the rank-based or the iterative purification procedure might be preferred. Rank-based purification selects anchor items based on a preliminary DIF test. For a preliminary DIF test, likelihood ratio test (LRT)-based approaches (e.g.,…
Descriptors: Test Items, Equated Scores, Test Bias, Accuracy
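The iterative purification procedure contrasted in this entry can be summarized in a few lines. Below is a minimal sketch, assuming a hypothetical `dif_test(item, anchor)` callable that returns a p-value for a candidate item given the current anchor set; the real procedures use LRT-based or similar DIF statistics.

```python
def iterative_purification(items, dif_test, alpha=0.05, max_iter=20):
    """Iterative anchor purification: items flagged for DIF are dropped
    from the anchor and the DIF test is rerun against the reduced anchor
    until the flagged set stabilizes.

    `dif_test(item, anchor)` is a hypothetical callable returning a
    p-value for `item` tested against the given anchor items.
    """
    anchor = set(items)
    flagged = set()
    for _ in range(max_iter):
        flagged = {i for i in items if dif_test(i, anchor - {i}) < alpha}
        new_anchor = set(items) - flagged
        if new_anchor == anchor:  # converged: anchor unchanged this round
            break
        anchor = new_anchor
    return anchor, flagged
```

Rank-based purification, by contrast, runs the preliminary DIF test once and selects anchors from its results, avoiding this loop at the cost of relying on a single preliminary test.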
Tom Benton – Practical Assessment, Research & Evaluation, 2025
This paper proposes an extension of linear equating that may be useful in one of two fairly common assessment scenarios. One is where different students have taken different combinations of test forms. This might occur, for example, where students have some free choice over the exam papers they take within a particular qualification. In this…
Descriptors: Equated Scores, Test Format, Test Items, Computation
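For context, the baseline that such extensions build on is standard linear equating, which matches standardized scores across forms. A minimal sketch of the textbook formula (not Benton's extension):

```python
def linear_equate(x, mu_x, sd_x, mu_y, sd_y):
    """Standard linear equating: map a form-X score x to the form-Y
    scale so that z-scores match, (x - mu_x)/sd_x = (y - mu_y)/sd_y."""
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Example: form X has mean 50, SD 10; form Y has mean 55, SD 12.
print(linear_equate(60, 50, 10, 55, 12))  # 67.0
```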
Kelsey Nason; Christine DeMars – Journal of Educational Measurement, 2025
This study examined the widely used threshold of 0.2 for Yen's Q3, an index for violations of local independence. Specifically, a simulation was conducted to investigate whether Q3 values were related to the magnitude of bias in estimates of reliability, item parameters, and examinee ability. Results showed that Q3 values below the typical cut-off…
Descriptors: Item Response Theory, Statistical Bias, Test Reliability, Test Items
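Yen's Q3 itself is straightforward to compute: fit an IRT model, form person-by-item residuals, and correlate them across item pairs. A minimal sketch, assuming dichotomous responses and model-implied probabilities are already in hand:

```python
import numpy as np

def yens_q3(responses, expected):
    """Yen's Q3: correlations between item-pair residuals after an IRT
    fit. `responses` and `expected` are (persons x items) arrays of
    observed 0/1 scores and model-implied probabilities."""
    residuals = responses - expected
    q3 = np.corrcoef(residuals, rowvar=False)  # item-by-item matrix
    np.fill_diagonal(q3, np.nan)               # self-correlations excluded
    return q3

# Item pairs with |Q3| above a cut-off (0.2 is the threshold the study
# examines) are conventionally flagged as local-independence violations.
```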
Bin Tan; Nour Armoush; Elisabetta Mazzullo; Okan Bulut; Mark J. Gierl – International Journal of Assessment Tools in Education, 2025
This study reviews existing research on the use of large language models (LLMs) for automatic item generation (AIG). We performed a comprehensive literature search across seven research databases, selected studies based on predefined criteria, and summarized 60 relevant studies that employed LLMs in the AIG process. We identified the most commonly…
Descriptors: Artificial Intelligence, Test Items, Automation, Test Format
Yanyan Fu – Educational Measurement: Issues and Practice, 2024
The template-based automated item-generation (TAIG) approach, which involves template creation, item generation, item selection, field-testing, and evaluation, has more steps than the traditional item development method. Consequently, there is more margin for error in this process, and any template errors can cascade to the generated items.…
Descriptors: Error Correction, Automation, Test Items, Test Construction
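To see why template errors cascade, consider a toy template-based generator (entirely hypothetical, just to illustrate the mechanism): every generated item inherits whatever is wrong in the template.

```python
from itertools import product

# Hypothetical item template: each (a, b) pair instantiates one item.
TEMPLATE = "A train travels {a} km per hour for {b} hours. How far does it go?"

def generate_items(a_values, b_values):
    """Instantiate the template over all combinations of variable values.
    A flaw in TEMPLATE (ambiguous stem, wrong key rule) propagates to
    every generated item -- the error cascading the article describes."""
    return [{"stem": TEMPLATE.format(a=a, b=b), "key": a * b}
            for a, b in product(a_values, b_values)]

items = generate_items([40, 60], [2, 3])  # four items from one template
```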
Stella Y. Kim; Sungyeun Kim – Educational Measurement: Issues and Practice, 2025
This study presents several multivariate generalizability theory designs for analyzing test forms based on automatic item generation (AIG). The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different form generated by AIG. A…
Descriptors: Generalizability Theory, Automation, Test Items, Students
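As background, the quantity G-theory designs ultimately feed is a generalizability coefficient; in the simplest one-facet person-by-item design (a standard formula, not the multivariate designs of this study) it is

```latex
% One-facet p x i design, relative decisions over n_i items:
E\rho^2 = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pi,e}/n_i}
```

where \sigma^2_p is the person variance component and \sigma^2_{pi,e} the residual person-by-item component.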
Daibao Guo; Katherine Landau Wright; Lianne Josbacher; Eun Hye Son – Elementary School Journal, 2025
Limited research has explored the use of visual displays (ViDis) in science tests, making it challenging to know how these tests align with classroom instruction and what skills students need to be successful on these tests. Therefore, the current study aims to describe the use of ViDis in upper elementary grade standardized science tests. We…
Descriptors: Standardized Tests, Science Tests, Elementary Education, Science Education
Said Al Faraby; Adiwijaya Adiwijaya; Ade Romadhony – International Journal of Artificial Intelligence in Education, 2024
Questioning plays a vital role in education, directing knowledge construction and assessing students' understanding. However, creating high-level questions requires significant creativity and effort. Automatic question generation is expected to facilitate the generation of not only fluent and relevant but also educationally valuable questions.…
Descriptors: Test Items, Automation, Computer Software, Input Output Analysis
Séverin Lions; María Paz Blanco; Pablo Dartnell; Carlos Monsalve; Gabriel Ortega; Julie Lemarié – Applied Measurement in Education, 2024
Multiple-choice items are universally used in formal education. Since they should assess learning, not test-wiseness or guesswork, they must be constructed following the highest possible standards. Hundreds of item-writing guides have provided guidelines to help test developers adopt appropriate strategies to define the distribution and sequence…
Descriptors: Test Construction, Multiple Choice Tests, Guidelines, Test Items
Jianbin Fu; TsungHan Ho; Xuan Tan – Practical Assessment, Research & Evaluation, 2025
Item parameter estimation using an item response theory (IRT) model with fixed ability estimates is useful for equating when samples on anchor items are small. The current study explores the impact of three ability estimation methods (weighted likelihood estimation [WLE], maximum a posteriori [MAP], and posterior ability distribution estimation [PST])…
Descriptors: Item Response Theory, Test Items, Computation, Equated Scores
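Fixed-ability calibration reduces each item to a small independent optimization, since person parameters are not re-estimated. A minimal sketch for a single 2PL item with abilities held fixed (illustrative only; the study compares how the fixed abilities themselves are obtained via WLE, MAP, or PST):

```python
import numpy as np
from scipy.optimize import minimize

def fit_item_2pl(u, theta):
    """Calibrate one 2PL item (a, b) with person abilities held fixed:
    maximize the item's likelihood given known theta. `u` is a 0/1
    response vector aligned with `theta`."""
    def nll(params):
        a, b = params
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        p = np.clip(p, 1e-9, 1 - 1e-9)
        return -np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))
    return minimize(nll, x0=[1.0, 0.0], method="Nelder-Mead").x

rng = np.random.default_rng(0)
theta = rng.normal(size=500)                     # fixed ability estimates
p_true = 1 / (1 + np.exp(-1.2 * (theta - 0.5)))  # true a=1.2, b=0.5
u = rng.binomial(1, p_true)
print(fit_item_2pl(u, theta))                    # approx. [1.2, 0.5]
```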
Miguel A. García-Pérez – Educational and Psychological Measurement, 2024
A recurring question regarding Likert items is whether the discrete steps that this response format allows represent constant increments along the underlying continuum. This question appears unsolvable because Likert responses carry no direct information to this effect. Yet, any item administered in Likert format can identically be administered…
Descriptors: Likert Scales, Test Construction, Test Items, Item Analysis
Po-Chun Huang; Ying-Hong Chan; Ching-Yu Yang; Hung-Yuan Chen; Yao-Chung Fan – IEEE Transactions on Learning Technologies, 2024
The question generation (QG) task plays a crucial role in adaptive learning. While significant advances in QG performance have been reported, existing QG studies are still far from practical use. One point that needs strengthening is the generation of question groups, which remains untouched. For forming a question group, intrafactors…
Descriptors: Automation, Test Items, Computer Assisted Testing, Test Construction
Mahmood Ul Hassan; Frank Miller – Journal of Educational Measurement, 2024
Multidimensional achievement tests have recently been gaining importance in educational and psychological measurement. For example, multidimensional diagnostic tests can help students determine which particular domain of knowledge they need to improve for better performance. To estimate the characteristics of candidate items (calibration) for…
Descriptors: Multidimensional Scaling, Achievement Tests, Test Items, Test Construction
Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Yun-Kyung Kim; Li Cai – National Center for Research on Evaluation, Standards, and Student Testing (CRESST), 2025
This paper introduces an application of cross-classified item response theory (IRT) modeling to an assessment utilizing the embedded standard setting (ESS) method (Lewis & Cook). The cross-classified IRT model is used to treat both item and person effects as random, where the item effects are regressed on the target performance levels (target…
Descriptors: Standard Setting (Scoring), Item Response Theory, Test Items, Difficulty Level
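One plausible form of such a model (an assumed sketch, not necessarily the paper's exact specification) treats person ability and item difficulty as crossed random effects, with difficulty regressed on the item's target performance level:

```latex
% Random-item Rasch-type sketch with difficulty regressed on target level:
\Pr(Y_{pi}=1) = \operatorname{logit}^{-1}(\theta_p - b_i), \qquad
\theta_p \sim N(0,\sigma^2_\theta), \qquad
b_i = \gamma_0 + \gamma_1\,\mathrm{level}_i + u_i,\; u_i \sim N(0,\sigma^2_b)
```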