Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 2 |
Descriptor
Source
Applied Measurement in… | 8 |
Author
Coffman, Don D. | 1 |
Dunbar, Stephen B. | 1 |
Feldt, Leonard S. | 1 |
Mehrens, William A. | 1 |
Millman, Jason | 1 |
Pastor, Dena A. | 1 |
Popham, W. James | 1 |
Taylor, Melinda Ann | 1 |
Thissen, David | 1 |
Vispoel, Walter P. | 1 |
Wainer, Howard | 1 |
More ▼ |
Publication Type
Journal Articles | 8 |
Reports - Evaluative | 6 |
Reports - Research | 2 |
Speeches/Meeting Papers | 1 |
Education Level
Elementary Education | 1 |
Elementary Secondary Education | 1 |
Grade 10 | 1 |
Grade 5 | 1 |
Grade 8 | 1 |
High Schools | 1 |
Intermediate Grades | 1 |
Junior High Schools | 1 |
Middle Schools | 1 |
Secondary Education | 1 |
Audience
Location
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores

Feldt, Leonard S. – Applied Measurement in Education, 1993
The recommendation that the reliability of multiple-choice tests will be enhanced if the distribution of item difficulties is concentrated at approximately 0.50 is reinforced and extended in this article by viewing the 0/1 item scoring as a dichotomization of an underlying normally distributed ability score. (SLD)
Descriptors: Ability, Difficulty Level, Guessing (Tests), Mathematical Models
Wise, Steven L. – Applied Measurement in Education, 2006
In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…
Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory

Wainer, Howard; Thissen, David – Applied Measurement in Education, 1993
Because assessment instruments of the future may well be composed of a combination of types of questions, a way to combine those scores effectively is discussed. Two new graphic tools are presented that show that it may not be practical to equalize the reliability of different components. (SLD)
Descriptors: Constructed Response, Educational Assessment, Graphs, Item Response Theory

Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques

Millman, Jason – Applied Measurement in Education, 1991
Alternatives to multiple-choice tests for teacher licensing examinations are described, and their advantages are cited. Concerns are expressed in the areas of cost and practicality, reliability, corruptibility, and validity. A suggestion for reducing costs using multiple-choice responses calibrated to constructed-response tasks is proposed. (SLD)
Descriptors: Beginning Teachers, Constructed Response, Cost Effectiveness, Educational Assessment

Vispoel, Walter P.; Coffman, Don D. – Applied Measurement in Education, 1994
Computerized-adaptive (CAT) and self-adapted (SAT) music listening tests were compared for efficiency, reliability, validity, and motivational benefits with 53 junior high school students. Results demonstrate trade-offs, with greater potential motivational benefits for SAT and greater efficiency for CAT. SAT elicited more favorable responses from…
Descriptors: Adaptive Testing, Computer Assisted Testing, Efficiency, Item Response Theory

Mehrens, William A.; Popham, W. James – Applied Measurement in Education, 1992
This paper discusses how to determine whether a test was developed in a legally defensible manner, reviewing general issues, specific cases bearing on different types of test use, some evaluative dimensions, and evidence of test quality. Tests constructed and used according to existing standards will generally stand legal scrutiny. (SLD)
Descriptors: College Entrance Examinations, Compliance (Legal), Constitutional Law, Court Litigation