Showing 1 to 15 of 20 results
Peer reviewed
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring of constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Peer reviewed
Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020
Recent research has suggested that resetting the standard for each administration of a small-sample examination is costly and does not adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…
Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores
Peer reviewed
Kopp, Jason P.; Jones, Andrew T. – Applied Measurement in Education, 2020
Traditional psychometric guidelines suggest that at least several hundred respondents are needed to obtain accurate parameter estimates under the Rasch model. However, recent research indicates that Rasch equating results in accurate parameter estimates with sample sizes as small as 25. Item parameter drift under the Rasch model has been…
Descriptors: Item Response Theory, Psychometrics, Sample Size, Sampling
Peer reviewed
Diao, Hongyu; Keller, Lisa – Applied Measurement in Education, 2020
Examinees who attempt the same test multiple times are often referred to as "repeaters." Previous studies suggested that repeaters should be excluded from the total sample before equating because repeater groups are distinguishable from non-repeater groups. In addition, repeaters might memorize anchor items, causing item drift under a…
Descriptors: Licensing Examinations (Professions), College Entrance Examinations, Repetition, Testing Problems
Peer reviewed
Kannan, Priya; Sgammato, Adrienne; Tannenbaum, Richard J.; Katz, Irvin R. – Applied Measurement in Education, 2015
The Angoff method requires experts to view every item on the test and make a probability judgment. This can be time-consuming when there are large numbers of items on the test. In this study, a G-theory framework was used to determine whether a subset of items can be used to make generalizable cut-score recommendations. Angoff ratings (i.e.,…
Descriptors: Reliability, Standard Setting (Scoring), Cutting Scores, Test Items
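The Angoff procedure summarized above reduces to averaging judges' item-level probability estimates and summing across items to obtain a recommended cut score. A minimal sketch, using hypothetical ratings (not data from the study):

```python
# Minimal Angoff cut-score computation (hypothetical ratings).
# Each row holds one judge's probability estimates that a minimally
# competent examinee answers each item correctly.
ratings = [
    [0.6, 0.8, 0.5, 0.7],  # judge 1
    [0.5, 0.9, 0.4, 0.6],  # judge 2
    [0.7, 0.7, 0.6, 0.8],  # judge 3
]

n_judges = len(ratings)
n_items = len(ratings[0])

# Average across judges for each item, then sum across items.
item_means = [sum(judge[i] for judge in ratings) / n_judges
              for i in range(n_items)]
cut_score = sum(item_means)
print(round(cut_score, 2))  # → 2.6 (out of 4 items)
```

Using a subset of items, as the study investigates, amounts to computing `item_means` over a sample of columns and rescaling to test length.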
Peer reviewed
Steedle, Jeffrey T. – Applied Measurement in Education, 2014
Possible lack of motivation is a perpetual concern when tests have no stakes attached to performance. Specifically, the validity of test score interpretations may be compromised when examinees are unmotivated to exert their best efforts. Motivation filtering, a procedure that filters out apparently unmotivated examinees, was applied to the…
Descriptors: College Outcomes Assessment, Student Motivation, Sampling, Validity
Peer reviewed
Michaelides, Michalis P.; Haertel, Edward H. – Applied Measurement in Education, 2014
The standard error of equating quantifies the variability in the estimation of an equating function. Because common items for deriving equated scores are treated as fixed, the only source of variability typically considered arises from the estimation of common-item parameters from responses of samples of examinees. Use of alternative, equally…
Descriptors: Equated Scores, Test Items, Sampling, Statistical Inference
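The abstract above notes that common items are usually treated as fixed, so the standard error of equating reflects only examinee sampling. Treating the anchor items themselves as sampled can be illustrated with a small bootstrap that resamples items rather than examinees; the data and the mean-equating setup below are illustrative assumptions, not from the article:

```python
import random

# Sketch: bootstrap the standard error of a mean-equating constant by
# resampling the common (anchor) items (illustrative data only).
random.seed(0)

# Proportion correct on each common item for the two groups.
new_form = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59]
old_form = [0.58, 0.52, 0.69, 0.50, 0.61, 0.57]

def equating_constant(idx):
    """Mean new-minus-old difference over the sampled anchor items."""
    return sum(new_form[i] - old_form[i] for i in idx) / len(idx)

n = len(new_form)
boots = []
for _ in range(2000):
    idx = [random.randrange(n) for _ in range(n)]  # resample items
    boots.append(equating_constant(idx))

mean_b = sum(boots) / len(boots)
se = (sum((b - mean_b) ** 2 for b in boots) / (len(boots) - 1)) ** 0.5
print(f"bootstrap SE over item sampling: {se:.4f}")
```

The point of the sketch is only that varying which items serve as the anchor produces a nonzero standard error even with examinee responses held fixed.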
Peer reviewed
Kim, Sooyeon; Walker, Michael – Applied Measurement in Education, 2012
This study examined the appropriateness of the anchor composition in a mixed-format test, which includes both multiple-choice (MC) and constructed-response (CR) items, using subpopulation invariance indices. Linking functions were derived in the nonequivalent groups with anchor test (NEAT) design using two types of anchor sets: (a) MC only and (b)…
Descriptors: Multiple Choice Tests, Test Format, Test Items, Equated Scores
Peer reviewed
Phillips, Gary W. – Applied Measurement in Education, 2015
This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…
Descriptors: State Programs, Sampling, Research Design, Error of Measurement
Peer reviewed
Feldt, Leonard S. – Applied Measurement in Education, 1990
Sampling theory for the intraclass reliability coefficient, a Spearman-Brown extrapolation of alpha to a single measurement for each examinee, is less recognized and less cited than that of coefficient alpha. Techniques for constructing confidence intervals and testing hypotheses for the intraclass coefficient are presented. (SLD)
Descriptors: Hypothesis Testing, Measurement Techniques, Reliability, Sampling
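The intraclass coefficient described above, a Spearman-Brown extrapolation of alpha to a single measurement, follows from inverting the Spearman-Brown formula. A minimal sketch with illustrative values (not from the article):

```python
# Inverse Spearman-Brown: reliability of a single measurement, given
# coefficient alpha for a k-part composite (illustrative values only).
def single_measure_reliability(alpha: float, k: int) -> float:
    return alpha / (k - (k - 1) * alpha)

alpha = 0.90  # hypothetical alpha for a 10-item composite
k = 10
rho1 = single_measure_reliability(alpha, k)
print(round(rho1, 3))  # → 0.474
```

Applying the ordinary Spearman-Brown prophecy formula to `rho1` with length factor `k` recovers the original alpha, which is the sense in which the intraclass coefficient is an extrapolation of alpha.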
Peer reviewed
Lawrence, Ida M.; Dorans, Neil J. – Applied Measurement in Education, 1990
The sample invariant properties of five anchor test equating methods are addressed. Equating results across two sampling conditions--representative sampling and new-form matched sampling--are compared for Tucker and Levine equally reliable linear equating, item response theory true-score equating, and two equipercentile methods. (SLD)
Descriptors: Equated Scores, Item Response Theory, Sampling, Statistical Analysis
Peer reviewed
Kolen, Michael J. – Applied Measurement in Education, 1990
Articles on equating test forms in this issue are reviewed and discussed. The results of these papers collectively indicate that matching on the anchor test does not result in more accurate equating. Implications for research are discussed. (SLD)
Descriptors: Equated Scores, Item Response Theory, Research Design, Sampling
Peer reviewed
Skaggs, Gary – Applied Measurement in Education, 1990
The articles in this issue that address the effect of matching samples on ability are reviewed. In spite of these examinations of equating methods and sampling plans, a definitive answer to the question of whether or not to match remains elusive. Implications are discussed. (SLD)
Descriptors: Equated Scores, Item Response Theory, Research Methodology, Sampling
Peer reviewed
Eignor, Daniel R.; And Others – Applied Measurement in Education, 1990
Two independent replications of a sequence of simulations were conducted to aid in the diagnosis and interpretation of equating differences found between representative (random) and matched (nonrandom) samples for three commonly used conventional observed-score equating procedures and one item-response-theory-based equating procedure. (SLD)
Descriptors: Equated Scores, Item Response Theory, Sampling, Simulation
Peer reviewed
Raymond, Mark R. – Applied Measurement in Education, 2001
Reviews general approaches to job analysis and considers methodological issues related to sampling and the development of rating scales used to measure and describe a profession or occupation. Evaluates the usefulness of different types of test plans and describes judgmental and empirical methods for using practice analysis data to help develop…
Descriptors: Certification, Job Analysis, Licensing Examinations (Professions), Rating Scales