Showing 1 to 15 of 17 results
Peer reviewed
Samah AlKhuzaey; Floriana Grasso; Terry R. Payne; Valentina Tamma – International Journal of Artificial Intelligence in Education, 2024
Designing and constructing pedagogical tests whose items (i.e., questions) equitably measure various types of skills across different student levels is a challenging task. Teachers and item writers alike need to ensure that the quality of assessment materials is consistent if student evaluations are to be objective and effective.…
Descriptors: Test Items, Test Construction, Difficulty Level, Prediction
Peer reviewed
Villarroel, Verónica; Bloxham, Susan; Bruna, Daniela; Bruna, Carola; Herrera-Seda, Constanza – Assessment & Evaluation in Higher Education, 2018
Authenticity has been identified as a key characteristic of assessment design which promotes learning. Authentic assessment aims to replicate the tasks and performance standards typically found in the world of work, and has been found to have a positive impact on student learning, autonomy, motivation, self-regulation and metacognition; abilities…
Descriptors: Performance Based Assessment, Barriers, Higher Education, Models
Peer reviewed
Yang, Xuexue – International Multilingual Research Journal, 2020
Despite the importance of assessment accommodations, little is known about their use in the context of classroom assessments. To guide teachers in supporting their emergent bilinguals during classroom assessments, ideas from large-scale assessments may be adapted for classroom use. This article, a targeted…
Descriptors: Testing Accommodations, Measurement, Bilingualism, Second Language Learning
Peer reviewed
Nichols, Bryan E. – Update: Applications of Research in Music Education, 2016
The purpose of this review of literature was to identify research findings for designing assessments in singing accuracy. The aim was to specify the test construction variables that directly affect test performance to guide future design in singing accuracy assessment for research and classroom uses. Three pitch-matching tasks--single pitch,…
Descriptors: Singing, Accuracy, Music, Music Education
Peer reviewed
Gierl, Mark J.; Bulut, Okan; Guo, Qi; Zhang, Xinxin – Review of Educational Research, 2017
Multiple-choice testing is considered one of the most effective and enduring forms of educational assessment in practice today. This study presents a comprehensive review of the literature on multiple-choice testing in education, focused specifically on the development, analysis, and use of the incorrect options, which are also…
Descriptors: Multiple Choice Tests, Difficulty Level, Accuracy, Error Patterns
Peer reviewed
Haladyna, Tom; Roid, Gale – Journal of Educational Measurement, 1981
The rationale for use of instructional sensitivity in the empirical review of test items is examined, and the results of a study that distinguishes instructional sensitivity from other item concepts are presented. Research is reviewed which indicates the existence of instructional sensitivity as a unique criterion-referenced test item concept. (RL)
Descriptors: Criterion Referenced Tests, Difficulty Level, Evaluation Criteria, Pretests Posttests
Peer reviewed
Albanese, Mark A. – Educational Measurement: Issues and Practice, 1993
A comprehensive review is given of evidence bearing on the recommendation to avoid complex multiple-choice (CMC) items. Avoiding Type K items (four primary responses and five secondary choices) seems warranted, but the evidence against CMC in general is less clear. (SLD)
Descriptors: Cues, Difficulty Level, Multiple Choice Tests, Responses
Woldbeck, Tanya – 1998
This paper summarizes some of the basic concepts in test equating. Various types of equating methods, as well as data collection designs, are outlined, with the aim of providing insight into preferred methods and techniques. Test equating describes a group of methods that enable test constructors and users to compare scores from two different forms…
Descriptors: Comparative Analysis, Data Collection, Difficulty Level, Equated Scores
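To make the equating idea concrete, here is a minimal sketch of linear equating, one standard method covered in overviews like this one. The score means and standard deviations below are hypothetical, assumed only for illustration:

    # Linear equating sketch: map a Form X score onto the Form Y scale
    # by matching the two forms' means and standard deviations.
    # (Hypothetical numbers; not from the paper.)
    def linear_equate(x, mean_x, sd_x, mean_y, sd_y):
        """Return the Form Y equivalent of a Form X score x."""
        return sd_y / sd_x * (x - mean_x) + mean_y

    # Example: Form X has mean 50 and SD 10; Form Y has mean 53 and SD 12.
    # A Form X score of 60 then corresponds to a Form Y score of 65.
    print(linear_equate(60, 50, 10, 53, 12))  # 65.0

Under this mapping, a score one standard deviation above the Form X mean is placed one standard deviation above the Form Y mean.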
Peer reviewed
Knowles, Susan L.; Welch, Cynthia A. – Educational and Psychological Measurement, 1992
A meta-analysis of the difficulty and discrimination of the "none-of-the-above" (NOTA) test option was conducted with 12 articles (20 effect sizes) for difficulty and 7 studies (11 effect sizes) for discrimination. Findings indicate that using the NOTA option does not result in items of lesser quality. (SLD)
Descriptors: Difficulty Level, Effect Size, Meta Analysis, Multiple Choice Tests
Peer reviewed
Alderson, J. Charles – Reading in a Foreign Language, 1990
Focuses on the performance levels of the Test of English for Educational Purposes (TEEP) and English Language Testing Service (ELTS) tests. It is concluded that more attention should be paid to the process underlying test performance. (15 references) (GLR)
Descriptors: Classification, Difficulty Level, English (Second Language), Language Tests
Peer reviewed
Frisbie, David A. – Educational Measurement: Issues and Practice, 1992
Literature related to the multiple true-false (MTF) item format is reviewed. Each answer cluster of an MTF item may contain several true statements, and the correctness of each is judged independently. MTF tests appear efficient and reliable, although they are somewhat harder than multiple-choice items for examinees. (SLD)
Descriptors: Achievement Tests, Difficulty Level, Literature Reviews, Multiple Choice Tests
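As an illustration of the format Frisbie describes (a hypothetical cluster, not an item from the review), each statement under a single stem is keyed true or false and scored independently:

    # Hypothetical multiple true-false (MTF) cluster: one stem, several
    # statements, each judged true/false and scored independently.
    stem = "Regarding classical test theory:"
    statements = [
        ("Reliability coefficients cannot exceed 1.0", True),
        ("Item difficulty p is the proportion answering correctly", True),
        ("A harder item always discriminates better", False),
    ]
    responses = [True, True, True]  # one examinee's answers

    # Each statement contributes one independent score point.
    score = sum(resp == key for (_, key), resp in zip(statements, responses))
    print(score, "of", len(statements), "correct")  # 2 of 3 correct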
Wainer, Howard; Thissen, David – 1994
When an examination consists in whole or part of constructed response test items, it is common practice to allow the examinee to choose a subset of the constructed response questions from a larger pool. It is sometimes argued that, if choice were not allowed, the limitations on domain coverage forced by the small number of items might unfairly…
Descriptors: Constructed Response, Difficulty Level, Educational Testing, Equated Scores
Rentz, R. Robert; Rentz, Charlotte C. – 1978
Issues of concern to test developers interested in applying the Rasch model are discussed. The current state of the art, recommendations for use of the model, further needs, and controversies are described for the three stages of test construction: (1) definition of the content of the test and item writing; (2) item analysis; and (3) test…
Descriptors: Ability, Achievement Tests, Difficulty Level, Goodness of Fit
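For reference, the Rasch model discussed here gives the probability that person $j$ answers item $i$ correctly as a function of a single ability parameter $\theta_j$ and a single item difficulty parameter $b_i$:

$$P(X_{ij} = 1 \mid \theta_j, b_i) = \frac{e^{\theta_j - b_i}}{1 + e^{\theta_j - b_i}}$$

Item analysis under the model amounts to estimating the $b_i$ and checking that observed response patterns fit this form.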
Colton, Dean A. – 1993
Tables of specifications are used to guide test developers in sampling items and maintaining consistency from form to form. This paper is a generalizability study of the American College Testing Program (ACT) Achievement Program Mathematics Test (AAP), with the content areas of the table of specifications representing multiple dependent variables.…
Descriptors: Achievement Tests, Difficulty Level, Error of Measurement, Generalizability Theory
Haladyna, Tom; Roid, Gale – 1980
An empirical review of test items is described as an essential step in criterion-referenced test development. The concept of test items' instructional sensitivity is introduced, and research is briefly reviewed which describes four theoretical contexts in which instructional sensitivity indexes have been observed: criterion-referenced; classical…
Descriptors: Achievement Tests, Bayesian Statistics, Course Objectives, Criterion Referenced Tests
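One widely cited index of the kind reviewed here is the pretest-posttest difference index, which expresses an item's instructional sensitivity as the change in its difficulty (proportion correct) from before to after instruction:

$$D_i = p_{i,\text{post}} - p_{i,\text{pre}}$$

This is stated for orientation only; the paper itself surveys several such indexes across the theoretical contexts it identifies.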