ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	2

Source

Applied Measurement in…

Author

Coffman, Don D.	1
Dunbar, Stephen B.	1
Feldt, Leonard S.	1
Mehrens, William A.	1
Millman, Jason	1
Pastor, Dena A.	1
Popham, W. James	1
Taylor, Melinda Ann	1
Thissen, David	1
Vispoel, Walter P.	1
Wainer, Howard	1
Wise, Steven L.	1

Publication Type

Journal Articles	8
Reports - Evaluative	6
Reports - Research	2
Speeches/Meeting Papers	1

Education Level

Elementary Education	1
Elementary Secondary Education	1
Grade 10	1
Grade 5	1
Grade 8	1
High Schools	1
Intermediate Grades	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1

Showing all 8 results

An Application of Generalizability Theory to Evaluate the Technical Quality of an Alternate Assessment

Peer reviewed

Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013

Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…

Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

The Relationship between the Distribution of Item Difficulties and Test Reliability.

Peer reviewed

Feldt, Leonard S. – Applied Measurement in Education, 1993

The recommendation that the reliability of multiple-choice tests will be enhanced if the distribution of item difficulties is concentrated at approximately 0.50 is reinforced and extended in this article by viewing the 0/1 item scoring as a dichotomization of an underlying normally distributed ability score. (SLD)

Descriptors: Ability, Difficulty Level, Guessing (Tests), Mathematical Models

An Investigation of the Differential Effort Received by Items on a Low-Stakes Computer-Based Test

Peer reviewed

Wise, Steven L. – Applied Measurement in Education, 2006

In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…

Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory

Combining Multiple-Choice and Constructed-Response Test Scores: Toward a Marxist Theory of Test Construction.

Peer reviewed

Wainer, Howard; Thissen, David – Applied Measurement in Education, 1993

Because assessment instruments of the future may well be composed of a combination of types of questions, a way to combine those scores effectively is discussed. Two new graphic tools are presented that show that it may not be practical to equalize the reliability of different components. (SLD)

Descriptors: Constructed Response, Educational Assessment, Graphs, Item Response Theory

Quality Control in the Development and Use of Performance Assessments.

Peer reviewed

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991

Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)

Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques

Teacher Licensing and the New Assessment Methodologies.

Peer reviewed

Millman, Jason – Applied Measurement in Education, 1991

Alternatives to multiple-choice tests for teacher licensing examinations are described, and their advantages are cited. Concerns are expressed in the areas of cost and practicality, reliability, corruptibility, and validity. A suggestion for reducing costs using multiple-choice responses calibrated to constructed-response tasks is proposed. (SLD)

Descriptors: Beginning Teachers, Constructed Response, Cost Effectiveness, Educational Assessment

Computerized-Adaptive and Self-Adapted Music-Listening Tests: Psychometric Features and Motivational Benefits.

Peer reviewed

Vispoel, Walter P.; Coffman, Don D. – Applied Measurement in Education, 1994

Computerized-adaptive (CAT) and self-adapted (SAT) music listening tests were compared for efficiency, reliability, validity, and motivational benefits with 53 junior high school students. Results demonstrate trade-offs, with greater potential motivational benefits for SAT and greater efficiency for CAT. SAT elicited more favorable responses from…

Descriptors: Adaptive Testing, Computer Assisted Testing, Efficiency, Item Response Theory

How to Evaluate the Legal Defensibility of High-Stakes Tests.

Peer reviewed

Mehrens, William A.; Popham, W. James – Applied Measurement in Education, 1992

This paper discusses how to determine whether a test was developed in a legally defensible manner, reviewing general issues, specific cases bearing on different types of test use, some evaluative dimensions, and evidence of test quality. Tests constructed and used according to existing standards will generally stand legal scrutiny. (SLD)

Descriptors: College Entrance Examinations, Compliance (Legal), Constitutional Law, Court Litigation

Descriptor

Test Construction	8
Test Reliability	8
Test Validity	5
Scores	4
Educational Assessment	3
Item Response Theory	3
Multiple Choice Tests	3
Computer Assisted Testing	2
Constructed Response	2
Guessing (Tests)	2
Performance Based Assessment	2
Test Use	2
Ability	1
Adaptive Testing	1
Alternative Assessment	1
Beginning Teachers	1
College Entrance Examinations	1
Compliance (Legal)	1
Constitutional Law	1
Cost Effectiveness	1
Court Litigation	1
Cutting Scores	1
Decision Making	1
Difficulty Level	1
Disabilities	1