ERIC - Search Results

Showing all 15 results

Traditional vs Intersectional DIF Analysis: Considerations and a Comparison Using State Testing Data

Peer reviewed

Albano, Tony; French, Brian F.; Vo, Thao Thu – Applied Measurement in Education, 2024

Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…

Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for, they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

The Comparability of Scores from Different Digital Devices: A Literature Review and Synthesis with Recommendations for Practice

Peer reviewed

Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018

Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…

Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education

Practical Application of a Synthetic Linking Function on Small-Sample Equating

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011

The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…

Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis

A Case of Inconsistent Equatings: How the Man with Four Watches Decides What Time It Is

Peer reviewed

Livingston, Samuel A.; Antal, Judit – Applied Measurement in Education, 2010

A simultaneous equating of four new test forms to each other and to one previous form was accomplished through a complex design incorporating seven separate equating links. Each new form was linked to the reference form by four different paths, and each path produced a different score conversion. The procedure used to resolve these inconsistencies…

Descriptors: Measurement Techniques, Measurement, Educational Assessment, Educational Testing

Detecting and Correcting Scale Drift in Test Equating: An Illustration from a Large Scale Testing Program

Peer reviewed

Puhan, Gautam – Applied Measurement in Education, 2009

The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…

Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory

Two Approaches for Identifying Low-Motivated Students in a Low-Stakes Assessment Context

Peer reviewed

Swerdzewski, Peter J.; Harmes, J. Christine; Finney, Sara J. – Applied Measurement in Education, 2011

Many universities rely on data gathered from tests that are low stakes for examinees but high stakes for the various programs being assessed. Given the lack of consequences associated with many collegiate assessments, the construct-irrelevant variance introduced by unmotivated students is potentially a serious threat to the validity of the…

Descriptors: Computer Assisted Testing, Student Motivation, Inferences, Universities

A Comparison of IRT Linking Procedures

Peer reviewed

Lee, Won-Chan; Ban, Jae-Chun – Applied Measurement in Education, 2010

Various applications of item response theory often require linking to achieve a common scale for item parameter estimates obtained from different groups. This article used a simulation to examine the relative performance of four different item response theory (IRT) linking procedures in a random groups equating design: concurrent calibration with…

Descriptors: Item Response Theory, Simulation, Comparative Analysis, Measurement Techniques

An Approach for Categorizing DIF in Polytomous Items

Peer reviewed

Penfield, Randall D. – Applied Measurement in Education, 2007

A widely used approach for categorizing the level of differential item functioning (DIF) in dichotomous items is the scheme proposed by Educational Testing Service (ETS) based on a transformation of the Mantel-Haenszel common odds ratio. In this article, two classification schemes for DIF in polytomous items (referred to as the P1 and P2 schemes)…

Descriptors: Simulation, Educational Testing, Test Bias, Evaluation Methods

Applying Bayesian Item Selection Approaches to Adaptive Tests Using Polytomous Items

Peer reviewed

Penfield, Randall D. – Applied Measurement in Education, 2006

This study applied the maximum expected information (MEI) and the maximum posterior-weighted information (MPI) approaches of computer adaptive testing item selection to the case of a test using polytomous items following the partial credit model. The MEI and MPI approaches are described. A simulation study compared the efficiency of ability…

Descriptors: Bayesian Statistics, Adaptive Testing, Computer Assisted Testing, Test Items

Student Test Score Reports and Interpretive Guides: Review of Current Practices and Suggestions for Future Research

Peer reviewed

Goodman, Dean P.; Hambleton, Ronald K. – Applied Measurement in Education, 2004

A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting…

Descriptors: Test Results, Student Evaluation, Scores, Testing Programs

Examining the Costs of Performance Assessment.

Peer reviewed

Hardy, Roy A. – Applied Measurement in Education, 1995

Cost factors associated with the development, administration, and scoring of performance assessment tasks are examined in the context of a statewide or other large-scale assessment program. Resources of money, time, and expertise are discussed. (SLD)

Descriptors: Cost Estimates, Costs, Educational Assessment, Estimation (Mathematics)

The Validity of Normative Data Provided for Customized Tests: Two Perspectives.

Peer reviewed

Forsyth, Robert A.; And Others – Applied Measurement in Education, 1992

Two criteria defined in previous research that can be used to evaluate the validity of normative data provided for customized tests are discussed. Results of an exploratory investigation of the validity of such data for about 2,500 fifth graders in a 1989 study are reported. (SLD)

Descriptors: Adaptive Testing, Elementary School Students, Evaluation Criteria, Evaluation Methods

Statistical Detection of Multiple-Choice Answer Copying: Review and Commentary.

Peer reviewed

Frary, Robert B. – Applied Measurement in Education, 1993

Methods for detecting copying of multiple-choice test responses are reviewed and compared with respect to their effectiveness and the practicality of their application for groups of varying sizes. Reasons why effective detection methods are seldom applied in standardized and classroom testing are discussed. (SLD)

Descriptors: Cheating, Elementary Secondary Education, Evaluation Methods, Higher Education

Performance Assessment: State Activity, Interest, and Concerns.

Peer reviewed

Aschbacher, Pamela R. – Applied Measurement in Education, 1991

A survey of state assessment directors by the Center for Research on Evaluation, Standards, and Student Testing at the University of California, Los Angeles, reveals that about 25 states are currently studying or developing performance assessments. Obstacles to statewide use of performance assessments were also reported. The new Student Assessment Exchange should…

Descriptors: Accountability, Cost Effectiveness, Educational Assessment, Educational Improvement
