Showing 1 to 15 of 91 results
Peer reviewed
Stefanie A. Wind; Benjamin Lugu – Applied Measurement in Education, 2024
Researchers who use measurement models for evaluation purposes often select models with stringent requirements, such as Rasch models, which are parametric. Mokken Scale Analysis (MSA) offers a theory-driven nonparametric modeling approach that may be more appropriate for some measurement applications. Researchers have discussed using MSA as a…
Descriptors: Item Response Theory, Data Analysis, Simulation, Nonparametric Statistics
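To make the contrast with parametric Rasch models concrete, MSA rests on nonparametric scalability indices rather than estimated item parameters. Below is a minimal Python sketch of Loevinger's H, the coefficient at the heart of Mokken scaling, assuming dichotomous 0/1 responses; the function and variable names are illustrative, not from the article.

```python
import numpy as np

def mokken_h(X):
    """Loevinger's scalability coefficient H for dichotomous items.

    X: (n_persons, n_items) array of 0/1 responses. By convention,
    H >= 0.3 is read as the minimum for a (weak) Mokken scale.
    """
    X = np.asarray(X)
    n, k = X.shape
    p = X.mean(axis=0)                    # item popularities
    order = np.argsort(p)                 # hardest (lowest p) first
    X, p = X[:, order], p[order]

    obs_err = exp_err = 0.0
    for i in range(k):                    # item i is harder than item j
        for j in range(i + 1, k):
            # Guttman error: passing the harder item, failing the easier one
            obs_err += np.sum((X[:, i] == 1) & (X[:, j] == 0))
            exp_err += n * p[i] * (1 - p[j])
    return 1.0 - obs_err / exp_err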
Peer reviewed
O'Dwyer, Eowyn P.; Sparks, Jesse R.; Nabors Oláh, Leslie – Applied Measurement in Education, 2023
A critical aspect of the development of culturally relevant classroom assessments is the design of tasks that affirm students' racial and ethnic identities and community cultural practices. This paper describes the process we followed to build a shared understanding of what culturally relevant assessments are, to pursue ways of bringing more…
Descriptors: Evaluation Methods, Culturally Relevant Education, Test Construction, Educational Research
Peer reviewed
Han, Yuting; Wilson, Mark – Applied Measurement in Education, 2022
A technology-based problem-solving test can automatically capture all of a student's actions as they complete tasks and save them as process data. Response sequences are the external manifestations of students' latent intellectual activities, and they contain rich information about students' abilities and different problem-solving…
Descriptors: Technology Uses in Education, Problem Solving, 21st Century Skills, Evaluation Methods
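As a concrete illustration of what "process data" looks like before modeling, raw event logs are typically collapsed into one time-ordered action sequence per student. A minimal sketch, assuming a (student_id, timestamp, action) record layout that may differ from the data used in the article:

```python
from collections import defaultdict

def build_sequences(log):
    """Collapse raw event-log records into per-student action sequences.

    log: iterable of (student_id, timestamp, action) tuples, as a
    technology-based task might capture them. Returns a dict mapping
    each student to their time-ordered list of actions.
    """
    seqs = defaultdict(list)
    for sid, ts, action in sorted(log, key=lambda r: (r[0], r[1])):
        seqs[sid].append(action)
    return dict(seqs)

# e.g. build_sequences([(1, 0.0, "start"), (1, 2.5, "drag"), (1, 9.1, "submit")])
# -> {1: ["start", "drag", "submit"]}
```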
Peer reviewed
Tony Albano; Brian F. French; Thao Thu Vo – Applied Measurement in Education, 2024
Recent research has demonstrated an intersectional approach to the study of differential item functioning (DIF). This approach expands DIF to account for the interactions between what have traditionally been treated as separate grouping variables. In this paper, we compare traditional and intersectional DIF analyses using data from a state testing…
Descriptors: Test Items, Item Analysis, Data Use, Standardized Tests
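The Mantel-Haenszel statistic underlying most traditional DIF analyses extends directly to an intersectional design: instead of one grouping variable at a time, the focal group is defined by crossed labels such as gender × language status. A hedged Python sketch of that idea (the article's actual procedure may differ):

```python
import numpy as np

def mh_dif(item, matching, group, focal):
    """Mantel-Haenszel DIF statistic for one dichotomous item.

    item     : 0/1 responses to the studied item
    matching : matching scores (e.g., rest scores) used to form strata
    group    : group label per examinee; for an intersectional analysis,
               pass crossed labels such as "female_EL" instead of a
               single demographic variable
    focal    : label of the focal group (all others are reference)
    Returns the ETS delta (negative values disfavor the focal group).
    """
    item, matching = np.asarray(item), np.asarray(matching)
    is_focal = np.asarray(group) == focal
    num = den = 0.0
    for s in np.unique(matching):          # one 2x2 table per stratum
        m = matching == s
        a = np.sum(m & ~is_focal & (item == 1))   # reference correct
        b = np.sum(m & ~is_focal & (item == 0))   # reference incorrect
        c = np.sum(m & is_focal & (item == 1))    # focal correct
        d = np.sum(m & is_focal & (item == 0))    # focal incorrect
        t = a + b + c + d
        if t:
            num += a * d / t
            den += b * c / t
    return -2.35 * np.log(num / den)       # ETS delta metric
```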
Peer reviewed
Wyse, Adam E. – Applied Measurement in Education, 2020
This article compares the Angoff method with cut scores from two variations of the Hofstee and Beuk methods, which determine cut scores by resolving inconsistencies in panelists' judgments about cut scores and pass rates. The first variation uses responses to the Hofstee and Beuk percentage correct and pass rate questions to calculate cut scores.…
Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Equations (Mathematics)
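For readers unfamiliar with the compromise methods, the classic Hofstee procedure takes panel-average judgments of the lowest and highest acceptable cut scores (k_min, k_max) and fail rates (f_min, f_max), then locates the point where the line from (k_min, f_max) to (k_max, f_min) crosses the empirical fail-rate curve. A minimal sketch of that baseline procedure, not of the article's specific variations:

```python
import numpy as np

def hofstee_cut(scores, k_min, k_max, f_min, f_max):
    """Classic Hofstee compromise cut score.

    scores       : examinee percent-correct scores
    k_min, k_max : lowest/highest acceptable cut (panel means, % correct)
    f_min, f_max : lowest/highest acceptable fail rate (proportions)
    """
    scores = np.asarray(scores)
    cuts = np.linspace(k_min, k_max, 501)
    fail = np.array([(scores < c).mean() for c in cuts])  # empirical fail rate
    line = f_max + (f_min - f_max) * (cuts - k_min) / (k_max - k_min)
    return cuts[np.argmin(np.abs(fail - line))]  # closest approach = crossing
```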
Peer reviewed
Perez, Alexandra Lane; Evans, Carla – Applied Measurement in Education, 2023
New Hampshire's Performance Assessment of Competency Education (PACE) innovative assessment system uses student scores from classroom performance assessments as well as other classroom tests for school accountability purposes. One concern is that not having annual state testing may incentivize schools and teachers away from teaching the breadth of…
Descriptors: Grade 8, Competency Based Education, Evaluation Methods, Educational Innovation
Peer reviewed
Myers, Aaron J.; Ames, Allison J.; Leventhal, Brian C.; Holzman, Madison A. – Applied Measurement in Education, 2020
When rating performance assessments, raters may assign different scores to the same performance when their application of the rubric does not align with the intended application of the scoring criteria. Given that performance assessment score interpretation assumes raters apply rubrics as rubric developers intended, misalignment between raters' scoring processes…
Descriptors: Scoring Rubrics, Validity, Item Response Theory, Interrater Reliability
Peer reviewed
Peabody, Michael R. – Applied Measurement in Education, 2020
The purpose of the current article is to introduce the equating and evaluation methods used in this special issue. Although a comprehensive review of all existing models and methodologies would be impractical given the format, a brief introduction to some of the more popular models is provided, along with a discussion of the conditions required…
Descriptors: Evaluation Methods, Equated Scores, Sample Size, Item Response Theory
Peer reviewed
Furter, Robert T.; Dwyer, Andrew C. – Applied Measurement in Education, 2020
Maintaining equivalent performance standards across forms is a psychometric challenge exacerbated by small samples. In this study, the accuracy of two equating methods (Rasch anchored calibration and nominal weights mean) and four anchor item selection methods was investigated in the context of very small samples (N = 10). Overall, nominal…
Descriptors: Classification, Accuracy, Item Response Theory, Equated Scores
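The nominal weights mean method referenced here is, in essence, anchor-based (Tucker-style) mean equating with the regression slope replaced by the nominal ratio of test length to anchor length, which is what keeps it stable at very small N. A sketch under that reading; the variable names are mine and the study's implementation details may differ:

```python
import numpy as np

def nominal_weights_mean_equate(x, v_x, y, v_y, n_items, n_anchor):
    """Anchor-based mean equating with nominal weights (a sketch).

    x, v_x : new-form total and anchor scores
    y, v_y : old-form total and anchor scores
    The Tucker regression slopes are replaced by the nominal ratio
    n_items / n_anchor. Returns a function mapping new-form scores
    onto the old-form scale.
    """
    x, v_x, y, v_y = map(np.asarray, (x, v_x, y, v_y))
    gamma = n_items / n_anchor
    w_x = len(x) / (len(x) + len(y))      # nominal (sample-size) weights
    w_y = 1.0 - w_x
    dv = v_x.mean() - v_y.mean()          # anchor mean difference
    mu_sx = x.mean() - w_y * gamma * dv   # synthetic-population means
    mu_sy = y.mean() + w_x * gamma * dv
    return lambda score: score + (mu_sy - mu_sx)
```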
Peer reviewed
Goodman, Joshua T.; Dallas, Andrew D.; Fan, Fen – Applied Measurement in Education, 2020
Recent research has suggested that re-setting the standard for each administration of a small-sample examination is not only costly but also fails to adequately maintain similar performance expectations year after year. Small-sample equating methods have shown promise with samples between 20 and 30. For groups that have fewer than 20 students,…
Descriptors: Equated Scores, Sample Size, Sampling, Weighted Scores
Peer reviewed
Soland, James; Wise, Steven L.; Gao, Lingyun – Applied Measurement in Education, 2019
Disengaged responding is a phenomenon that often biases observed scores from achievement tests and surveys in practically and statistically significant ways. This problem has led to the development of methods to detect and correct for disengaged responses on both achievement test and survey scores. One major disadvantage when trying to detect…
Descriptors: Reaction Time, Metadata, Response Style (Tests), Student Surveys
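The response-time methods alluded to here typically flag a response as a rapid guess when it arrives faster than some item-level threshold, then summarize engagement as the proportion of non-flagged responses (response-time effort). A sketch using one common normative threshold, 10% of the item's median time; the article may use a different rule:

```python
import numpy as np

def response_time_effort(rt, threshold_frac=0.10):
    """Flag rapid guesses and compute response-time effort (RTE).

    rt: (n_persons, n_items) matrix of response times in seconds.
    A response is flagged as disengaged when it is faster than
    threshold_frac of that item's median time. RTE is each
    examinee's share of engaged (non-flagged) responses.
    """
    rt = np.asarray(rt, dtype=float)
    thresholds = threshold_frac * np.median(rt, axis=0)  # per-item cutoffs
    rapid = rt < thresholds
    return rapid, 1.0 - rapid.mean(axis=1)
```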
Peer reviewed
McClintock, Joseph Clair – Applied Measurement in Education, 2015
Erasure analysis is the study of the pattern or quantity of erasures on multiple-choice paper-and-pencil examinations to determine whether erasures were made after testing for the purpose of unfairly increasing students' scores. This study examined erasure data from over 1.4 million exams taken by more than 600,000 students. Three…
Descriptors: Multiple Choice Tests, Cheating, Methods, Computation
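A typical erasure-analysis screen counts wrong-to-right (WTR) erasures and flags classrooms or schools whose mean count sits implausibly far above the population mean. A generic sketch of that idea, not of the three specific methods the study compares:

```python
import numpy as np

def flag_wtr_outliers(wtr_counts, groups, n_sd=4.0):
    """Flag groups with improbably many wrong-to-right erasures.

    wtr_counts : WTR erasure count per examinee
    groups     : classroom/school label per examinee
    A group is flagged when its mean count exceeds the overall mean
    by more than n_sd standard errors of the group mean.
    """
    wtr = np.asarray(wtr_counts, dtype=float)
    grp = np.asarray(groups)
    mu, sd = wtr.mean(), wtr.std(ddof=1)
    flagged = []
    for g in np.unique(grp):
        x = wtr[grp == g]
        if x.mean() > mu + n_sd * sd / np.sqrt(len(x)):
            flagged.append(g)
    return flagged
```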
Peer reviewed
Gotwals, Amelia Wenk – Applied Measurement in Education, 2018
In this commentary, I consider the three empirical studies in this special issue based on two main aspects: (a) the nature of the learning progressions and (b) what formative assessment practice(s) were investigated. Specifically, I describe differences among the learning progressions in terms of scope and grain size. I also identify three…
Descriptors: Skill Development, Behavioral Objectives, Formative Evaluation, Evaluation Methods
Peer reviewed
Alonzo, Alicia C. – Applied Measurement in Education, 2018
Learning progressions--particularly as defined and operationalized in science education--have significant potential to inform teachers' formative assessment practices. In this overview article, I lay out an argument for this potential, starting from definitions for "formative assessment practices" and "learning progressions"…
Descriptors: Skill Development, Behavioral Objectives, Science Education, Formative Evaluation
Peer reviewed
Raczynski, Kevin; Cohen, Allan – Applied Measurement in Education, 2018
The literature on Automated Essay Scoring (AES) systems has provided useful validation frameworks for any assessment that includes AES scoring. Furthermore, evidence for the scoring fidelity of AES systems is accumulating. Yet questions remain when appraising the scoring performance of AES systems. These questions include: (a) which essays are…
Descriptors: Essay Tests, Test Scoring Machines, Test Validity, Evaluators
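One common index for appraising how closely AES scores track human raters, though not the validation framework proposed in this article, is quadratic weighted kappa. A self-contained sketch:

```python
import numpy as np

def quadratic_weighted_kappa(human, machine, n_levels):
    """Quadratic weighted kappa between human and machine essay scores.

    Scores are integers in 0..n_levels-1. Returns 1 for perfect
    agreement and 0 for chance-level agreement.
    """
    human, machine = np.asarray(human), np.asarray(machine)
    O = np.zeros((n_levels, n_levels))
    for h, m in zip(human, machine):      # observed agreement matrix
        O[h, m] += 1
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / len(human)  # chance
    i, j = np.indices((n_levels, n_levels))
    W = (i - j) ** 2 / (n_levels - 1) ** 2  # quadratic disagreement weights
    return 1.0 - (W * O).sum() / (W * E).sum()
```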