Showing 1 to 15 of 50 results
Peer reviewed
Deborah J. Harris – Educational Measurement: Issues and Practice, 2024
This article is based on my 2023 NCME Presidential Address, where I talked a bit about my journey into the profession, and more substantively about comparable scores. Specifically, I discussed some of the different ways 'comparable scores' are defined, highlighted some areas I think we as a profession need to pay more attention to when considering…
Descriptors: Scores, Comparative Analysis, Speeches, Career Development
Peer reviewed
Ulitzsch, Esther; Lüdtke, Oliver; Robitzsch, Alexander – Educational Measurement: Issues and Practice, 2023
Country differences in response styles (RS) may jeopardize cross-country comparability of Likert-type scales. When adjusting for rather than investigating RS is the primary goal, it seems advantageous to impose minimal assumptions on RS structures and leverage information from multiple scales for RS measurement. Using PISA 2015 background…
Descriptors: Response Style (Tests), Comparative Analysis, Achievement Tests, Foreign Countries
Peer reviewed
Zhan, Peida; He, Keren – Educational Measurement: Issues and Practice, 2021
In learning diagnostic assessments, the attribute hierarchy specifies a sequential network of interrelated attribute mastery processes, which makes a test blueprint consistent with the cognitive theory. One of the most important functions of attribute hierarchy is to guide or limit the developmental direction of students and then form a…
Descriptors: Longitudinal Studies, Models, Comparative Analysis, Diagnostic Tests
Peer reviewed
Fidler, James R.; Risk, Nicole M. – Educational Measurement: Issues and Practice, 2019
Credentialing examination developers rely on task (job) analyses for establishing inventories of task and knowledge areas in which competency is required for safe and successful practice in target occupations. There are many ways in which task-related information may be gathered from practitioner ratings, each with its own advantage and…
Descriptors: Job Analysis, Scaling, Licensing Examinations (Professions), Test Construction
Peer reviewed
Wind, Stefanie A.; Walker, A. Adrienne – Educational Measurement: Issues and Practice, 2021
Many large-scale performance assessments include score resolution procedures for resolving discrepancies in rater judgments. The goal of score resolution is conceptually similar to person fit analyses: To identify students for whom observed scores may not accurately reflect their achievement. Previously, researchers have observed that…
Descriptors: Goodness of Fit, Performance Based Assessment, Evaluators, Decision Making
Peer reviewed
Sims, Maureen E.; Cox, Troy L.; Eckstein, Grant T.; Hartshorn, K. James; Wilcox, Matthew P.; Hart, Judson M. – Educational Measurement: Issues and Practice, 2020
The purpose of this study is to explore the reliability of a potentially more practical approach to direct writing assessment in the context of ESL writing. Traditional rubric rating (RR) is a common yet resource-intensive evaluation practice when performed reliably. This study compared the traditional rubric model of ESL writing assessment and…
Descriptors: Scoring Rubrics, Item Response Theory, Second Language Learning, English (Second Language)
Peer reviewed
Yocarini, Iris E.; Bouwmeester, Samantha; Smeets, Guus; Arends, Lidia R. – Educational Measurement: Issues and Practice, 2018
This real-data-guided simulation study systematically evaluated the decision accuracy of complex decision rules combining multiple tests within different realistic curricula. Specifically, complex decision rules combining conjunctive aspects and compensatory aspects were evaluated. A conjunctive aspect requires a minimum level of performance,…
Descriptors: Comparative Analysis, Decision Making, Accuracy, Higher Education
Peer reviewed
Luecht, Richard; Ackerman, Terry A. – Educational Measurement: Issues and Practice, 2018
Simulation studies are extremely common in the item response theory (IRT) research literature. This article presents a didactic discussion of "truth" and "error" in IRT-based simulation studies. We ultimately recommend that future research focus less on the simple recovery of parameters from a convenient generating IRT model,…
Descriptors: Item Response Theory, Simulation, Ethics, Error of Measurement
Peer reviewed
Madison, Matthew J. – Educational Measurement: Issues and Practice, 2019
Recent advances have enabled diagnostic classification models (DCMs) to accommodate longitudinal data. These longitudinal DCMs were developed to study how examinees change, or transition, between different attribute mastery statuses over time. This study examines using longitudinal DCMs as an approach to assessing growth and serves three purposes:…
Descriptors: Longitudinal Studies, Item Response Theory, Psychometrics, Criterion Referenced Tests
Peer reviewed
Boevé, Anja J.; Meijer, Rob R.; Beldhuis, Hans J. A.; Bosker, Roel J.; Albers, Casper J. – Educational Measurement: Issues and Practice, 2019
To investigate the effect of innovations in the teaching-learning environment, researchers often compare study results from different cohorts across years. However, variance in scores can be attributed to both random fluctuation and systematic changes due to the innovation, complicating cohort comparisons. In the present study, we illustrate how…
Descriptors: Grades (Scholastic), Foreign Countries, Teaching Methods, Educational Innovation
Peer reviewed
Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2017
This article provides an overview of the Hofstee standard-setting method and illustrates several situations where the Hofstee method will produce undefined cut scores. The situations where the cut scores will be undefined involve cases where the line segment derived from the Hofstee ratings does not intersect the score distribution curve based on…
Descriptors: Cutting Scores, Evaluation Methods, Standard Setting (Scoring), Comparative Analysis
Peer reviewed
Sinharay, Sandip – Educational Measurement: Issues and Practice, 2018
The choice of anchor tests is crucial in applications of the nonequivalent groups with anchor test design of equating. Sinharay and Holland (2006, 2007) suggested "miditests," which are anchor tests that are content-representative and have the same mean item difficulty as the total test but have a smaller spread of item difficulties.…
Descriptors: Test Content, Difficulty Level, Test Items, Test Construction
Peer reviewed
Mattern, Krista; Radunzel, Justine; Bertling, Maria; Ho, Andrew D. – Educational Measurement: Issues and Practice, 2018
The percentage of students retaking college admissions tests is rising. Researchers and college admissions offices currently use a variety of methods for summarizing these multiple scores. Testing organizations such as ACT and the College Board, interested in validity evidence like correlations with first-year grade point average (FYGPA), often…
Descriptors: College Admission, Scores, Correlation, College Entrance Examinations
Peer reviewed
Cho, Sun-Joo; Suh, Youngsuk; Lee, Woo-yeol – Educational Measurement: Issues and Practice, 2016
The purpose of this ITEMS module is to provide an introduction to differential item functioning (DIF) analysis using mixture item response models. The mixture item response models for DIF analysis involve comparing item profiles across latent groups, instead of manifest groups. First, an overview of DIF analysis based on latent groups, called…
Descriptors: Test Bias, Research Methodology, Evaluation Methods, Models
Peer reviewed
Wright, Daniel B. – Educational Measurement: Issues and Practice, 2019
There is much discussion about and many policies to address achievement gaps in education among groups of students. The focus here is on a different gap and it is argued that it also should be of concern. Speed gaps are differences in how quickly different groups of students answer the questions on academic assessments. To investigate some speed…
Descriptors: Academic Achievement, Achievement Gap, Reaction Time, Educational Testing