ERIC - Search Results

Showing 1 to 15 of 49 results

Some Methods and Evaluation for Linking and Equating with Small Samples

Peer reviewed

Peabody, Michael R. – Applied Measurement in Education, 2020

The purpose of the current article is to introduce the equating and evaluation methods used in this special issue. Although a comprehensive review of all existing models and methodologies would be impractical given the format, a brief introduction to some of the more popular models will be provided. A brief discussion of the conditions required…

Descriptors: Evaluation Methods, Equated Scores, Sample Size, Item Response Theory

Where Are We Now? Learning Progressions and Formative Assessment

Peer reviewed

Gotwals, Amelia Wenk – Applied Measurement in Education, 2018

In this commentary, I consider the three empirical studies in this special issue based on two main aspects: (a) the nature of the learning progressions and (b) what formative assessment practice(s) were investigated. Specifically, I describe differences among the learning progressions in terms of scope and grain size. I also identify three…

Descriptors: Skill Development, Behavioral Objectives, Formative Evaluation, Evaluation Methods

In Search of Validity Evidence in Support of the Interpretation and Use of Assessments of Complex Constructs: Discussion of Research on Assessing 21st Century Skills

Peer reviewed

Ercikan, Kadriye; Oliveri, María Elena – Applied Measurement in Education, 2016

Assessing complex constructs such as those discussed under the umbrella of 21st century constructs highlights the need for a principled assessment design and validation approach. In our discussion, we made a case for three considerations: (a) taking construct complexity into account across various stages of assessment development such as the…

Descriptors: Evaluation Methods, Test Construction, Design, Scaling

Practical Application of a Synthetic Linking Function on Small-Sample Equating

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – Applied Measurement in Education, 2011

The synthetic function is a weighted average of the identity (the linking function for forms that are known to be completely parallel) and a traditional equating method. The purpose of the present study was to investigate the benefits of the synthetic function on small-sample equating using various real data sets gathered from different…

Descriptors: Testing Programs, Equated Scores, Investigations, Data Analysis

Application of Evidence-Centered Assessment Design to the Advanced Placement Redesign: A Graphic Restatement

Peer reviewed

Bejar, Isaac I. – Applied Measurement in Education, 2010

The foregoing articles constitute what I consider a comprehensive and clear description of the redesign process of a major assessment. The articles serve to illustrate the problems that will need to be addressed by large-scale assessments in the twenty-first century. Primary among them is how to organize the development of such assessments to meet…

Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction

Two Approaches for Identifying Low-Motivated Students in a Low-Stakes Assessment Context

Peer reviewed

Swerdzewski, Peter J.; Harmes, J. Christine; Finney, Sara J. – Applied Measurement in Education, 2011

Many universities rely on data gathered from tests that are low stakes for examinees but high stakes for the various programs being assessed. Given the lack of consequences associated with many collegiate assessments, the construct-irrelevant variance introduced by unmotivated students is potentially a serious threat to the validity of the…

Descriptors: Computer Assisted Testing, Student Motivation, Inferences, Universities

The Utility of Augmented Subscores in a Licensure Exam: An Evaluation of Methods Using Empirical Data

Peer reviewed

Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – Applied Measurement in Education, 2010

Will subscores provide more information than the total score provides? Is there a method that can estimate more trustworthy subscores than observed subscores? To answer the first question, this study evaluated whether the true subscore was more accurately predicted by the observed subscore or total score. To answer the second…

Descriptors: Licensing Examinations (Professions), Scores, Computation, Methods

Criterion-Focused Approach to Reducing Adverse Impact in College Admissions

Peer reviewed

Sinha, Ruchi; Oswald, Frederick; Imus, Anna; Schmitt, Neal – Applied Measurement in Education, 2011

The current study examines how using a multidimensional battery of predictors (high-school grade point average (GPA), SAT/ACT, and biodata), and weighting the predictors based on the different values institutions place on various student performance dimensions (college GPA, organizational citizenship behaviors (OCBs), and behaviorally anchored…

Descriptors: Grade Point Average, Interrater Reliability, Rating Scales, College Admission

Innovations in Measuring Rater Accuracy in Standard Setting: Assessing "Fit" to Item Characteristic Curves

Peer reviewed

Hurtz, Gregory M.; Jones, J. Patrick – Applied Measurement in Education, 2009

Standard setting methods such as the Angoff method rely on judgments of item characteristics; item response theory empirically estimates item characteristics and displays them in item characteristic curves (ICCs). This study evaluated several indexes of rater fit to ICCs as a method for judging rater accuracy in their estimates of expected item…

Descriptors: Standard Setting (Scoring), Item Response Theory, Reliability, Measurement

Evidence-Centered Assessment Design as a Foundation for Achievement-Level Descriptor Development and for Standard Setting

Peer reviewed

Plake, Barbara S.; Huff, Kristen; Reshetar, Rosemary – Applied Measurement in Education, 2010

In many large-scale assessment programs, achievement level descriptors (ALDs) provide a critical role in communicating what scores on the assessment mean and in interpreting what examinees know and are able to do based on their test performance. Based on their test performance, examinees are often classified into performance categories. The…

Descriptors: Evidence, Test Construction, Measurement, Standard Setting

Validating Measurement of Knowledge Integration in Science Using Multiple-Choice and Explanation Items

Peer reviewed

Lee, Hee-Sun; Liu, Ou Lydia; Linn, Marcia C. – Applied Measurement in Education, 2011

This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item…

Descriptors: Knowledge Level, Construct Validity, Validity, Scaffolding (Teaching Technique)

Detecting and Correcting Scale Drift in Test Equating: An Illustration from a Large Scale Testing Program

Peer reviewed

Puhan, Gautam – Applied Measurement in Education, 2009

The purpose of this study is to determine the extent of scale drift on a test that employs cut scores. It was essential to examine scale drift for this testing program because new forms in this testing program are often put on scale through a series of intermediate equatings (known as equating chains). This process may cause equating error to…

Descriptors: Testing Programs, Testing, Measurement Techniques, Item Response Theory

Comparisons of Methodologies and Results in Vertical Scaling for Educational Achievement Tests

Peer reviewed

Tong, Ye; Kolen, Michael J. – Applied Measurement in Education, 2007

A number of vertical scaling methodologies were examined in this article. Scaling variations included data collection design, scaling method, item response theory (IRT) scoring procedure, and proficiency estimation method. Vertical scales were developed for Grade 3 through Grade 8 for 4 content areas and 9 simulated datasets. A total of 11 scaling…

Descriptors: Achievement Tests, Scaling, Methods, Item Response Theory

Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

Peer reviewed

Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen – Applied Measurement in Education, 2008

In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…

Descriptors: Test Format, Measurement Techniques, Equations (Mathematics), Item Response Theory

Methodological Approaches to the Validation of Academic Self-Concept: The Construct and Its Measures.

Peer reviewed

Byrne, Barbara M. – Applied Measurement in Education, 1990

Methodological procedures used in validating the theoretical structure of academic self-concept and validating associated measurement instruments are reviewed. Substantive findings from research related to modes of inquiry are summarized, and recommendations for future research are outlined. (TJH)

Descriptors: Classification, Construct Validity, Evaluation Methods, Literature Reviews
