ERIC - Search Results

Publication Date

In 2025	0
Since 2024	2
Since 2021 (last 5 years)	3
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	9

Descriptor

Difficulty Level	26
Test Items	26
Test Construction	14
Item Analysis	7
Multiple Choice Tests	7
Achievement Tests	6
Literature Reviews	6
Test Bias	6
Test Format	6
Item Response Theory	5
Responses	5
Language Tests	4
Second Language Learning	4
Student Evaluation	4
Ability	3
Educational Research	3
English (Second Language)	3
Models	3
Reading Comprehension	3
Reading Tests	3
Statistical Analysis	3
Test Interpretation	3
Test Validity	3
Accuracy	2
Adaptive Testing	2

Source

Educational Measurement:…	2
Educational Testing Service	2
Journal of Educational…	2
Applied Measurement in…	1
Applied Psychological…	1
Assessment for Effective…	1
Educational Research Review	1
Educational and Psychological…	1
International Journal of…	1
National Center for Education…	1
Reading in a Foreign Language	1
Review of Educational Research	1
Scandinavian Journal of…	1
Studies in Second Language…	1
TESL-EJ	1
Update: Applications of…	1

Publication Type

Information Analyses	26
Journal Articles	16
Reports - Research	12
Reports - Evaluative	6
Speeches/Meeting Papers	5
Guides - Non-Classroom	1

Education Level

Elementary Secondary Education	2
Elementary Education	1
Grade 4	1
Grade 8	1
Higher Education	1
Middle Schools	1
Secondary Education	1

Audience

Students	1
Teachers	1

Assessments and Surveys

ACT Assessment	1
International English…	1
National Assessment of…	1
Program for International…	1
SAT (College Admission Test)	1
Test of English as a Foreign…	1

Showing 1 to 15 of 26 results

Text-Based Question Difficulty Prediction: A Systematic Review of Automatic Approaches

Peer reviewed

Samah AlKhuzaey; Floriana Grasso; Terry R. Payne; Valentina Tamma – International Journal of Artificial Intelligence in Education, 2024

Designing and constructing pedagogical tests that contain items (i.e. questions) which measure various types of skills for different levels of students equitably is a challenging task. Teachers and item writers alike need to ensure that the quality of assessment materials is consistent, if student evaluations are to be objective and effective.…

Descriptors: Test Items, Test Construction, Difficulty Level, Prediction

Text Complexity of Cambridge-Delivered IELTS Academic Reading Tests: Comparability with IELTS Academic Reading Practice Tests from Other Publishers

Peer reviewed

Huu Thanh Minh Nguyen; Nguyen Van Anh Le – TESL-EJ, 2024

Comparing language tests and test preparation materials holds important implications for the latter's validity and reliability. However, not enough studies compare such materials across a wide range of indices. Therefore, this study investigated the text complexity of IELTS academic reading tests (IRT) and IELTS reading practice tests (IRPrT).…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Readability

Measurement Properties of a Standardized Elicited Imitation Test: An Integrative Data Analysis

Peer reviewed

Isbell, Daniel R.; Son, Young-A – Studies in Second Language Acquisition, 2022

Elicited Imitation Tests (EITs) are commonly used in second language acquisition (SLA)/bilingualism research contexts to assess the general oral proficiency of study participants. While previous studies have provided valuable EIT construct-related validity evidence, some key gaps remain. This study uses an integrative data analysis to further…

Descriptors: Bilingualism, Imitation, Language Tests, Second Language Learning

Critical Variables in Singing Accuracy Test Construction: A Review of Literature

Peer reviewed

Nichols, Bryan E. – Update: Applications of Research in Music Education, 2016

The purpose of this review of literature was to identify research findings for designing assessments in singing accuracy. The aim was to specify the test construction variables that directly affect test performance to guide future design in singing accuracy assessment for research and classroom uses. Three pitch-matching tasks--single pitch,…

Descriptors: Singing, Accuracy, Music, Music Education

Developing, Analyzing, and Using Distractors for Multiple-Choice Tests in Education: A Comprehensive Review

Peer reviewed

Gierl, Mark J.; Bulut, Okan; Guo, Qi; Zhang, Xinxin – Review of Educational Research, 2017

Multiple-choice testing is considered one of the most effective and enduring forms of educational assessment that remains in practice today. This study presents a comprehensive review of the literature on multiple-choice testing in education focused, specifically, on the development, analysis, and use of the incorrect options, which are also…

Descriptors: Multiple Choice Tests, Difficulty Level, Accuracy, Error Patterns

Lessons Learned from PISA: A Systematic Review of Peer-Reviewed Articles on the Programme for International Student Assessment

Peer reviewed

Hopfenbeck, Therese N.; Lenkeit, Jenny; El Masri, Yasmine; Cantrell, Kate; Ryan, Jeanne; Baird, Jo-Anne – Scandinavian Journal of Educational Research, 2018

International large-scale assessments are on the rise, with the Programme for International Student Assessment (PISA) seen by many as having strategic prominence in education policy debates. The present article reviews PISA-related English-language peer-reviewed articles from the programme's first cycle in 2000 to its most current in 2015. Five…

Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students

Computer-Adaptive Testing for Students with Disabilities: A Review of the Literature. Research Report. ETS RR-11-32

Stone, Elizabeth; Davey, Tim – Educational Testing Service, 2011

There has been an increased interest in developing computer-adaptive testing (CAT) and multistage assessments for K-12 accountability assessments. The move to adaptive testing has been met with some resistance by those in the field of special education who express concern about routing of students with divergent profiles (e.g., some students with…

Descriptors: Disabilities, Adaptive Testing, Accountability, Computer Assisted Testing

A Supplement to "The Number of Guttman Errors as a Simple and Powerful Person-Fit Statistic."

Peer reviewed

Meijer, Rob R. – Applied Psychological Measurement, 1995

A statistic used by R. Meijer (1994) to determine person-fit referred to the number of errors from the deterministic Guttman model (L. Guttman, 1950), but this was, in fact, based on the number of errors from the deterministic Guttman model as defined by J. Loevinger (1947, 1948). (SLD)

Descriptors: Difficulty Level, Models, Responses, Scaling

Item-Level Effects of the Read-Aloud Accommodation for Students with Reading Disabilities

Peer reviewed

Bolt, Sara E.; Thurlow, Martha L. – Assessment for Effective Intervention, 2007

Research support for providing a read-aloud accommodation (i.e., having an individual read test items and directions aloud) to students with disabilities has been somewhat limited, particularly when merely examining effects of the accommodation on overall test scores for general groups of students with disabilities. We examined data on…

Descriptors: Reading Difficulties, Testing Accommodations, Reading Aloud to Others, Special Needs Students

Teachers' and Students' Perceptions of Assessments: A Review and a Study into the Ability and Accuracy of Estimating the Difficulty Levels of Assessment Items

Peer reviewed

van de Watering, Gerard; van der Rijt, Janine – Educational Research Review, 2006

In today's higher education, high quality assessments play an important role. Little is known, however, about the degree to which assessments are correctly aimed at the students' levels of competence in relation to the defined learning goals. This article reviews previous research into teachers' and students' perceptions of item difficulty. It…

Descriptors: Student Attitudes, Teacher Attitudes, College Students, College Faculty

The Role of Instructional Sensitivity in the Empirical Review of Criterion-Referenced Test Items.

Peer reviewed

Haladyna, Tom; Roid, Gale – Journal of Educational Measurement, 1981

The rationale for use of instructional sensitivity in the empirical review of test items is examined, and the results of a study that distinguishes instructional sensitivity from other item concepts are presented. Research is reviewed which indicates the existence of instructional sensitivity as a unique criterion-referenced test item concept. (RL)

Descriptors: Criterion Referenced Tests, Difficulty Level, Evaluation Criteria, Pretests Posttests

Type K and Other Complex Multiple-Choice Items: An Analysis of Research and Item Properties.

Peer reviewed

Albanese, Mark A. – Educational Measurement: Issues and Practice, 1993

A comprehensive review is given of evidence, with a bearing on the recommendation to avoid use of complex multiple choice (CMC) items. Avoiding Type K items (four primary responses and five secondary choices) seems warranted, but evidence against CMC in general is less clear. (SLD)

Descriptors: Cues, Difficulty Level, Multiple Choice Tests, Responses

Manipulating Processing Difficulty of Reading Comprehension Questions: The Feasibility of Verbal Item Generation

Peer reviewed

Gorin, Joanna S. – Journal of Educational Measurement, 2005

Based on a previously validated cognitive processing model of reading comprehension, this study experimentally examines potential generative components of text-based multiple-choice reading comprehension test questions. Previous research (Embretson & Wetzel, 1987; Gorin & Embretson, 2005; Sheehan & Ginther, 2001) shows text encoding and decision…

Descriptors: Reaction Time, Reading Comprehension, Difficulty Level, Test Items

A Meta-Analytic Review of Item Discrimination and Difficulty in Multiple-Choice Items Using "None-of-the-Above."

Peer reviewed

Knowles, Susan L.; Welch, Cynthia A. – Educational and Psychological Measurement, 1992

A meta-analysis of the difficulty and discrimination of the "none-of-the-above" (NOTA) test option was conducted with 12 articles (20 effect sizes) for difficulty and 7 studies (11 effect sizes) for discrimination. Findings indicate that using the NOTA option does not result in items of lesser quality. (SLD)

Descriptors: Difficulty Level, Effect Size, Meta Analysis, Multiple Choice Tests

Exploring Item Characteristics That Are Related to the Difficulty of TOEFL Dialogue Items. Research Reports. RR-79. RR-04-11

Kostin, Irene – Educational Testing Service, 2004

The purpose of this study is to explore the relationship between a set of item characteristics and the difficulty of TOEFL[R] dialogue items. Identifying characteristics that are related to item difficulty has the potential to improve the efficiency of the item-writing process. The study employed 365 TOEFL dialogue items, which were coded on 49…

Descriptors: Statistical Analysis, Difficulty Level, Language Tests, English (Second Language)

Author

Haladyna, Tom	2
Roid, Gale	2
Albanese, Mark A.	1
Anderson, J. Charles	1
Baird, Jo-Anne	1
Bolden, Bernadine J.	1
Bolt, Sara E.	1
Bulut, Okan	1
Cantrell, Kate	1
Colton, Dean A.	1
Davey, Tim	1
El Masri, Yasmine	1
Floriana Grasso	1
Frisbie, David A.	1
Garavaglia, Diane R.	1
Gierl, Mark J.	1
Gorin, Joanna S.	1
Guo, Qi	1
Hopfenbeck, Therese N.	1
Huu Thanh Minh Nguyen	1
Isbell, Daniel R.	1
Knowles, Susan L.	1
Kostin, Irene	1
Lenkeit, Jenny	1
Lockheed, Marlaine E.	1