Showing all 11 results
Peer reviewed
Almehrizi, Rashid S. – Applied Measurement in Education, 2021
KR-21 reliability and its extension (coefficient alpha) give the reliability estimate of test scores under the assumption of tau-equivalent forms. KR-21 reliability gives the reliability estimate for summed scores for dichotomous items when items are randomly sampled from an infinite pool of similar items (randomly parallel forms). The article…
Descriptors: Test Reliability, Scores, Scoring, Computation
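The KR-21 and coefficient alpha estimates mentioned in the Almehrizi abstract above follow well-known closed forms. Below is a minimal computational sketch of those textbook formulas, assuming a dichotomous (0/1) item-response matrix; the toy data and function names are illustrative, not the article's own procedure.

```python
import numpy as np

def kr21(scores: np.ndarray) -> float:
    """KR-21 for dichotomous items: uses only the number of items and the
    mean and variance of the summed scores."""
    k = scores.shape[1]                  # number of items
    total = scores.sum(axis=1)           # summed score per examinee
    m, var = total.mean(), total.var(ddof=1)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

def coefficient_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha: based on the ratio of summed item variances
    to the variance of the summed scores."""
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

# Toy data: 5 examinees x 4 dichotomous items (illustrative only).
x = np.array([[1, 1, 0, 1],
              [0, 1, 0, 0],
              [1, 1, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 1]])
print(kr21(x), coefficient_alpha(x))
```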
Peer reviewed
Slepkov, Aaron D.; Godfrey, Alan T. K. – Applied Measurement in Education, 2019
The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely…
Descriptors: Multiple Choice Tests, Test Items, Test Reliability, Scores
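A typical partial-credit rule for answer-until-correct items awards less credit for each additional selection made before the keyed answer is found. The linear deduction sketched below is an assumed rule for illustration only, not the scoring used by Slepkov and Godfrey.

```python
def auc_partial_credit(attempts_used: int, n_options: int = 4) -> float:
    """Linear partial-credit rule for an answer-until-correct item:
    full credit on the first selection, decreasing with each extra attempt.
    attempts_used = number of selections made until the keyed answer appeared."""
    if not 1 <= attempts_used <= n_options:
        raise ValueError("attempts_used must be between 1 and n_options")
    return (n_options - attempts_used) / (n_options - 1)

# Example: on a 4-option item, 1 attempt -> 1.0, 2 -> 2/3, 3 -> 1/3, 4 -> 0.0
print([auc_partial_credit(a) for a in range(1, 5)])
```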
Peer reviewed
Taylor, Melinda Ann; Pastor, Dena A. – Applied Measurement in Education, 2013
Although federal regulations require testing students with severe cognitive disabilities, there is little guidance regarding how technical quality should be established. It is known that challenges exist with documentation of the reliability of scores for alternate assessments. Typical measures of reliability do little in modeling multiple sources…
Descriptors: Generalizability Theory, Alternative Assessment, Test Reliability, Scores
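Generalizability theory estimates several sources of measurement error at once rather than a single reliability coefficient. As a hedged illustration only (a much simpler design than the alternate-assessment setting studied by Taylor and Pastor), here is a minimal one-facet persons-by-raters G-study sketch; the toy ratings are invented.

```python
import numpy as np

def g_study_p_x_r(scores: np.ndarray) -> dict:
    """One-facet G-study for a fully crossed persons-by-raters design.
    scores[p, r] is the score rater r gave to person p."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    ss_p = n_r * ((scores.mean(axis=1) - grand) ** 2).sum()
    ss_r = n_p * ((scores.mean(axis=0) - grand) ** 2).sum()
    ss_res = ((scores - grand) ** 2).sum() - ss_p - ss_r
    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))
    var_res = ms_res                                   # person x rater + residual
    var_p = max((ms_p - ms_res) / n_r, 0.0)            # negative estimates set to 0
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    g_rel = var_p / (var_p + var_res / n_r)            # relative G coefficient
    return {"var_person": var_p, "var_rater": var_r,
            "var_residual": var_res, "g": g_rel}

# Toy data: 4 students rated by 3 raters (illustrative only).
x = np.array([[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 2, 3]], dtype=float)
print(g_study_p_x_r(x))
```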
Peer reviewed
Wise, Lauress L. – Applied Measurement in Education, 2010
The articles in this special issue make two important contributions to our understanding of the impact of accommodations on test score validity. First, they illustrate a variety of methods for the collection and rigorous analysis of empirical data that can supplant expert judgment of the impact of accommodations. These methods range from internal…
Descriptors: Reading Achievement, Educational Assessment, Test Reliability, Learning Disabilities
Peer reviewed
Johnson, Robert L.; Penny, Jim; Fisher, Steve; Kuhs, Therese – Applied Measurement in Education, 2003
When raters assign different scores to a performance task, a method for resolving rating differences is required to report a single score to the examinee. Recent studies indicate that decisions about examinees, such as pass/fail decisions, differ across resolution methods. Previous studies also investigated the interrater reliability of…
Descriptors: Test Reliability, Test Validity, Scores, Interrater Reliability
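Common rating-resolution methods include averaging the original ratings, substituting an expert rating, or adjudicating only discrepant pairs with a third rating. The rules sketched below are illustrative assumptions, not the specific procedures compared by Johnson, Penny, Fisher, and Kuhs.

```python
from typing import Optional

def resolve_ratings(r1: float, r2: float, expert: Optional[float] = None,
                    method: str = "average", discrepancy: float = 1.0) -> float:
    """Resolve a pair of ratings into a single reported score.
    The three methods shown are common illustrations, not a fixed standard."""
    if method == "average":
        return (r1 + r2) / 2
    if method == "expert":                       # expert rating replaces both
        if expert is None:
            raise ValueError("expert rating required")
        return expert
    if method == "tertium_quid":                 # adjudicate only when raters disagree
        if abs(r1 - r2) <= discrepancy:
            return (r1 + r2) / 2
        if expert is None:
            raise ValueError("expert rating required for a discrepant pair")
        return (r1 + r2 + expert) / 3            # illustrative: average all three
    raise ValueError(f"unknown method: {method}")

print(resolve_ratings(3, 5, expert=4, method="tertium_quid"))   # -> 4.0
```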
Peer reviewed
Sykes, Robert C.; Hou, Liling – Applied Measurement in Education, 2003
Weighting responses to Constructed-Response (CR) items has been proposed as a way to increase the contribution these items make to the test score when there is insufficient testing time to administer additional CR items. The effect of various types of item weighting on an IRT-based mixed-format writing examination was investigated.…
Descriptors: Item Response Theory, Weighted Scores, Responses, Scores
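At the raw-score level, weighting constructed-response items amounts to a weighted sum of the format-level scores; in an IRT-based analysis the weights instead act through the scoring model. The weight and item composition below are illustrative assumptions, not the weighting schemes examined by Sykes and Hou.

```python
import numpy as np

def weighted_raw_score(mc_scores: np.ndarray, cr_scores: np.ndarray,
                       cr_weight: float = 2.0) -> float:
    """Weighted summed score for a mixed-format test: multiple-choice items
    count once, each constructed-response item is multiplied by cr_weight
    (an illustrative weighting scheme)."""
    return float(mc_scores.sum() + cr_weight * cr_scores.sum())

# Example: 10 MC items (0/1) and 2 CR items scored 0-4, with CR weighted x2.
mc = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])
cr = np.array([3, 4])
print(weighted_raw_score(mc, cr))   # 8 + 2 * 7 = 22
```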
Peer reviewed
Wise, Steven L. – Applied Measurement in Education, 2006
In low-stakes testing, the motivation levels of examinees are often a matter of concern to test givers because a lack of examinee effort represents a direct threat to the validity of the test data. This study investigated the use of response time to assess the amount of examinee effort received by individual test items. In 2 studies, it was found…
Descriptors: Computer Assisted Testing, Motivation, Test Validity, Item Response Theory
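An index in the spirit of the response-time approach described above classifies each response as rapid guessing or solution behavior using an item-level time threshold, then reports the proportion of items answered with solution behavior. The thresholds and times below are invented for illustration; real thresholds would be chosen from the observed response-time distributions.

```python
import numpy as np

def response_time_effort(response_times: np.ndarray, thresholds: np.ndarray) -> float:
    """Response-time-effort style index: the proportion of items on which the
    examinee's response time meets or exceeds an item-specific threshold that
    separates rapid guessing from solution behavior."""
    if response_times.shape != thresholds.shape:
        raise ValueError("one threshold per item is required")
    solution_behavior = response_times >= thresholds
    return float(solution_behavior.mean())

# Example: 5 items with illustrative thresholds in seconds.
rt = np.array([12.0, 2.5, 30.1, 1.8, 22.4])
th = np.array([5.0, 5.0, 10.0, 5.0, 10.0])
print(response_time_effort(rt, th))   # 3 of 5 items show solution behavior -> 0.6
```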
Peer reviewed
Wainer, Howard; Thissen, David – Applied Measurement in Education, 1993
Because assessment instruments of the future may well be composed of a combination of types of questions, a way to combine those scores effectively is discussed. Two new graphic tools are presented that show that it may not be practical to equalize the reliability of different components. (SLD)
Descriptors: Constructed Response, Educational Assessment, Graphs, Item Response Theory
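Under classical test theory, the reliability of a weighted composite of component scores (assuming uncorrelated errors across components) is rho_C = 1 - sum(w_i^2 * var_i * (1 - rho_i)) / var(composite). The sketch below computes this from component reliabilities, variances, weights, and correlations; the values are illustrative assumptions, not figures from Wainer and Thissen.

```python
import numpy as np

def composite_reliability(weights, variances, reliabilities, corr) -> float:
    """Reliability of a weighted composite under classical test theory,
    assuming errors are uncorrelated across components."""
    w = np.asarray(weights, dtype=float)
    v = np.asarray(variances, dtype=float)
    r = np.asarray(reliabilities, dtype=float)
    sd = np.sqrt(v)
    cov = np.asarray(corr, dtype=float) * np.outer(sd, sd)   # covariance matrix
    total_var = w @ cov @ w                                   # composite variance
    error_var = np.sum(w ** 2 * v * (1 - r))                  # composite error variance
    return float(1 - error_var / total_var)

# Example: an MC section and a CR section with equal weights (illustrative values).
print(composite_reliability(weights=[1, 1], variances=[25, 16],
                            reliabilities=[0.90, 0.75],
                            corr=[[1.0, 0.6], [0.6, 1.0]]))   # -> 0.9
```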
Peer reviewed
Holland, Paul W.; Wainer, Howard – Applied Measurement in Education, 1990
Two attempts to adjust state mean Scholastic Aptitude Test (SAT) scores for differential participation rates are examined. Both attempts are rejected, and five rules for performing adjustments are outlined to foster follow-up checks on untested assumptions. National Assessment of Educational Progress state data are determined to be more accurate.…
Descriptors: College Applicants, College Entrance Examinations, Estimation (Mathematics), Item Bias
Peer reviewed
Klein, Stephen P.; And Others – Applied Measurement in Education, 1995
Portfolios are the centerpiece of Vermont's statewide assessment program in mathematics. Portfolio scores in the first two years were not reliable enough to permit the reporting of student-level results, but increasing the number of readers or the number of portfolio pieces is not operationally feasible. (SLD)
Descriptors: Educational Assessment, Elementary Secondary Education, Mathematics Tests, Performance Based Assessment
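The trade-off noted here, whether adding readers or portfolio pieces could raise reliability enough for student-level reporting, is usually projected with the Spearman-Brown prophecy formula. The starting reliability below is an assumed value for illustration, not Vermont's actual figure.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Spearman-Brown prophecy: projected reliability when the number of
    readers (or portfolio pieces) is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Example: if a single reading yields reliability 0.40 (illustrative), even
# tripling or quadrupling the readings falls short of a 0.80 reporting standard.
for k in (1, 2, 3, 4):
    print(k, round(spearman_brown(0.40, k), 2))   # 0.4, 0.57, 0.67, 0.73
```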
Peer reviewed
Dunbar, Stephen B.; And Others – Applied Measurement in Education, 1991
Issues pertaining to the quality of performance assessments, including reliability and validity, are discussed. The relatively limited generalizability of performance across tasks is indicative of the care needed to evaluate performance assessments. Quality control is an empirical matter when measurement is intended to inform public policy. (SLD)
Descriptors: Educational Assessment, Generalization, Interrater Reliability, Measurement Techniques