Finch, Holmes – Applied Measurement in Education, 2022
Much research has been devoted to identification of differential item functioning (DIF), which occurs when the item responses for individuals from two groups differ after they are conditioned on the latent trait being measured by the scale. There has been less work examining differential step functioning (DSF), which is present for polytomous…
Descriptors: Comparative Analysis, Item Response Theory, Item Analysis, Simulation
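The article's own simulation methods are not reproduced here, but one standard statistic for flagging DIF of the kind described above is the Mantel-Haenszel common odds ratio, which conditions on an observed matching score rather than the latent trait. A minimal sketch (function name and data layout are illustrative, not drawn from the article):

```python
from collections import defaultdict

def mantel_haenszel_odds_ratio(responses, groups, item):
    """Common odds ratio for one item, stratifying examinees by total score.

    responses: list of 0/1 response vectors (one per examinee)
    groups:    parallel list of labels, "ref" or "focal"
    item:      index of the studied item
    Values near 1.0 suggest little DIF; values far from 1.0 suggest DIF.
    """
    # Stratify by total score on the remaining items (the matching criterion).
    strata = defaultdict(lambda: [0, 0, 0, 0])  # [A, B, C, D] per stratum
    for resp, grp in zip(responses, groups):
        rest = sum(resp) - resp[item]
        right = resp[item]
        a, b, c, d = strata[rest]
        if grp == "ref":
            a, b = a + right, b + (1 - right)
        else:
            c, d = c + right, d + (1 - right)
        strata[rest] = [a, b, c, d]

    num = den = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        if n == 0 or (a + b) == 0 or (c + d) == 0:
            continue  # strata containing only one group contribute nothing
        num += a * d / n   # reference-right, focal-wrong
        den += b * c / n   # reference-wrong, focal-right
    return num / den if den else float("nan")
```

With balanced data the ratio is 1.0; a ratio well above 1.0 means the item favors the reference group within score strata.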

Karadavut, Tugba – Applied Measurement in Education, 2021
Mixture IRT models address the heterogeneity in a population by extracting latent classes and allowing item parameters to vary between latent classes. Once the latent classes are extracted, they need to be further examined to be characterized. Some approaches have been adopted in the literature for this purpose. These approaches examine either the…
Descriptors: Item Response Theory, Models, Test Items, Maximum Likelihood Statistics
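As a hedged illustration of the mixture IRT idea the abstract describes (item parameters that vary across latent classes), the marginal likelihood of one response pattern under a two-class mixture Rasch model can be sketched as below; the function names and parameter values are hypothetical, not taken from the article:

```python
import math

def rasch_p(theta, b):
    """Correct-response probability under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mixture_likelihood(x, theta, class_probs, class_difficulties):
    """Marginal likelihood of response pattern x for an examinee with
    ability theta, mixing over latent classes whose item difficulties differ.

    class_probs:        mixing proportions, one per latent class
    class_difficulties: per-class list of item difficulty parameters
    """
    total = 0.0
    for pi_c, b_c in zip(class_probs, class_difficulties):
        like = 1.0
        for resp, b in zip(x, b_c):
            p = rasch_p(theta, b)
            like *= p if resp else (1.0 - p)
        total += pi_c * like  # weight the class-conditional likelihood
    return total
```

Characterizing the extracted classes then amounts to comparing the per-class difficulty vectors (or examinees' posterior class probabilities), which is the step the article examines.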

El Masri, Yasmine H.; Andrich, David – Applied Measurement in Education, 2020
In large-scale educational assessments, it is generally required that tests are composed of items that function invariantly across the groups to be compared. Despite efforts to ensure invariance in the item construction phase, for a range of reasons (including the security of items) it is often necessary to account for differential item…
Descriptors: Models, Goodness of Fit, Test Validity, Achievement Tests

Ketterlin-Geller, Leanne R.; Perry, Lindsey; Adams, Elizabeth – Applied Measurement in Education, 2019
Despite the call for an argument-based approach to validity over 25 years ago, few examples exist in the published literature. One possible explanation for this outcome is that the complexity of the argument-based approach makes implementation difficult. To counter this claim, we propose that the Assessment Triangle can serve as the overarching…
Descriptors: Validity, Educational Assessment, Models, Screening Tests

Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators

Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016
Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…
Descriptors: Evaluation Methods, Test Construction, Design, Scaling

Lee, Hee-Sun; Liu, Ou Lydia; Linn, Marcia C. – Applied Measurement in Education, 2011
This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item…
Descriptors: Knowledge Level, Construct Validity, Validity, Scaffolding (Teaching Technique)

Huff, Kristen; Steinberg, Linda; Matts, Thomas – Applied Measurement in Education, 2010
The cornerstone of evidence-centered assessment design (ECD) is an evidentiary argument that requires that each target of measurement (e.g., learning goal) for an assessment be expressed as a "claim" to be made about an examinee that is relevant to the specific purpose and audience(s) for the assessment. The "observable evidence" required to…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction

Hendrickson, Amy; Huff, Kristen; Luecht, Richard – Applied Measurement in Education, 2010
Evidence-centered assessment design (ECD) explicates a transparent evidentiary argument to warrant the inferences we make from student test performance. This article describes how the vehicles for gathering student evidence--task models and test specifications--are developed. Task models, which are the basis for item development, flow directly…
Descriptors: Evidence, Test Construction, Measurement, Classification

Sawyer, Richard – Applied Measurement in Education, 2007
Current thinking on validity suggests that educational institutions and individuals should evaluate their uses of test scores in the context of their fundamental goals. Regression coefficients and other traditional criterion-related validity statistics provide relevant information, but often do not, by themselves, address the fundamental reasons…
Descriptors: College Admission, Regression (Statistics), Test Validity, Scores
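The "traditional criterion-related validity statistics" the abstract refers to include the simple validity coefficient: the Pearson correlation between test scores and a criterion measure. A minimal sketch (function name and data hypothetical):

```python
def validity_coefficient(predictor, criterion):
    """Pearson correlation between test scores and a criterion measure,
    the traditional criterion-related validity statistic."""
    n = len(predictor)
    mx = sum(predictor) / n
    my = sum(criterion) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(predictor, criterion))
    sx = sum((x - mx) ** 2 for x in predictor) ** 0.5
    sy = sum((y - my) ** 2 for y in criterion) ** 0.5
    return cov / (sx * sy)
```

Sawyer's point is that such a coefficient, by itself, does not say whether a particular use of the scores serves an institution's goals; it is one input to that evaluation.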

Smisko, Ann; Twing, Jon S.; Denny, Patricia – Applied Measurement in Education, 2000
Describes the Texas test development process in detail, showing how each test development step is linked to the "Standards for Educational and Psychological Testing." The routine use of this process provides evidence of the content and curricular validity of the Texas Assessment of Academic Skills. (SLD)
Descriptors: Achievement Tests, Curriculum, Models, Test Construction

Ackerman, Terry A. – Applied Measurement in Education, 1994
When item response data do not satisfy the unidimensionality assumption, multidimensional item response theory (MIRT) should be used to model the item-examinee interaction. This article presents and discusses MIRT analyses designed to give better insight into what individual items are measuring. (SLD)
Descriptors: Evaluation Methods, Item Response Theory, Measurement Techniques, Models
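For concreteness (this is a generic sketch of compensatory MIRT, with hypothetical function names, not the article's analyses), the probability of a correct response under a multidimensional 2PL model, and the direction in the two-dimensional ability space that an item best measures, can be written as:

```python
import math

def m2pl_probability(theta, a, d):
    """Correct-response probability under a compensatory multidimensional
    2PL model: P = logistic(a . theta + d), where a holds the item's
    discrimination parameters and theta the examinee's ability vector."""
    z = sum(ai * ti for ai, ti in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))

def item_direction_degrees(a):
    """Angle (from the theta-1 axis) of the ability composite an item is
    most sensitive to, in the spirit of Ackerman-style item vector plots
    for two dimensions."""
    return math.degrees(math.atan2(a[1], a[0]))
```

Plotting each item's direction and discrimination as a vector is one way such analyses give "better insight into what individual items are measuring."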

Mehrens, William A.; Phillips, S. E. – Applied Measurement in Education, 1989
A sequential decision-making approach based on college grade point averages and test scores for teacher licensure decisions within the conjunctive model is contrasted with the compensatory model for decision making. Criteria for choosing one model over another and a rationale for favoring the conjunctive model are provided. (TJH)
Descriptors: Comparative Analysis, Cutting Scores, Decision Making, Grade Point Average
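The contrast between the two decision models can be sketched directly; the function names, weights, and cutoffs below are hypothetical, chosen only to show the structural difference:

```python
def conjunctive_pass(scores, cutoffs):
    """Conjunctive model: pass only if EVERY measure clears its own cutoff,
    so a strength on one measure cannot offset a weakness on another."""
    return all(s >= c for s, c in zip(scores, cutoffs))

def compensatory_pass(scores, weights, composite_cutoff):
    """Compensatory model: pass if a weighted composite clears a single
    cutoff, so a high score on one measure can offset a low one."""
    return sum(w * s for w, s in zip(weights, scores)) >= composite_cutoff
```

A candidate with a high GPA but a low test score can pass under a compensatory composite yet fail a conjunctive rule, which is exactly the kind of case the model-choice criteria must address.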

Hambleton, Ronald K.; Slater, Sharon C. – Applied Measurement in Education, 1997
A brief history of developments in the assessment of the reliability of credentialing examinations is presented, and some new results are outlined that highlight the interactions among scoring, standard setting, and the reliability and validity of pass-fail decisions. Decision consistency is an important concept in evaluating credentialing…
Descriptors: Certification, Credentials, Decision Making, Interaction
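A minimal sketch of the decision-consistency idea, assuming scores from two parallel forms and a single pass-fail cutoff (names and data hypothetical, not Hambleton and Slater's procedures):

```python
def decision_consistency(form_a_scores, form_b_scores, cutoff):
    """Proportion of examinees classified the same way (pass/fail) by two
    parallel forms: a simple decision-consistency index."""
    same = sum(
        (a >= cutoff) == (b >= cutoff)
        for a, b in zip(form_a_scores, form_b_scores)
    )
    return same / len(form_a_scores)
```

Because the index depends on where the cutoff sits relative to the score distribution, it ties together scoring, standard setting, and the reliability of pass-fail decisions, the interaction the article highlights.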