Showing 1 to 15 of 22 results
Peer reviewed
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation can supply many items instantly and efficiently to assessment and learning environments. Yet evaluating item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
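The entry above describes screening automatically generated items with a large language model (Llama 3-8B). As a rough illustration only, and not the authors' procedure, the Python snippet below shows one generic way to ask a model for a quality rating; the `generate` callable and the 1-5 rating prompt are hypothetical placeholders rather than anything specified in the article.

# Hypothetical sketch of LLM-based item-quality screening (not the study's method).
# `generate` stands in for whichever inference call the chosen model exposes.
def rate_item_quality(generate, stem, options):
    """Return a 1-5 quality rating parsed from the model's reply (0 if unparseable)."""
    prompt = (
        "You are reviewing a multiple-choice test item.\n"
        "Stem: " + stem + "\n"
        "Options: " + "; ".join(options) + "\n"
        "Rate the item's overall quality from 1 (unusable) to 5 (ready to field). "
        "Reply with the number only."
    )
    reply = generate(prompt)
    digits = [ch for ch in reply if ch.isdigit()]
    return int(digits[0]) if digits else 0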
Peer reviewed
Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024
Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…
Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement
Peer reviewed
Lewis, Daniel; Cook, Robert – Educational Measurement: Issues and Practice, 2020
In this paper we assert that the practice of principled assessment design renders traditional standard-setting methodology redundant at best and contradictory at worst. We describe the rationale for, and methodological details of, Embedded Standard Setting (ESS; previously Engineered Cut Scores; Lewis, 2016), an approach to establishing performance…
Descriptors: Standard Setting, Evaluation, Cutting Scores, Performance Based Assessment
Peer reviewed
Stephen G. Sireci; Javier Suárez-Álvarez; April L. Zenisky; Maria Elena Oliveri – Educational Measurement: Issues and Practice, 2024
The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-in-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this article, we lay the foundation for DIRTy…
Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction
Peer reviewed
Newton, Paul E. – Educational Measurement: Issues and Practice, 2017
The dominant narrative for assessment design seems to reflect a strong, albeit largely implicit, undercurrent of purpose purism, which idealizes the principle that assessment design should be driven by a single assessment purpose. With a particular focus on achievement assessments, the present article questions the tenability of purpose purism,…
Descriptors: Evaluation Methods, Test Construction, Instructional Design, Decision Making
Peer reviewed
Wang, Jue; Engelhard, George, Jr. – Educational Measurement: Issues and Practice, 2019
In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales. From a theoretical perspective, they discuss the historical and philosophical perspectives on measurement with a focus on Rasch's concept of specific objectivity and…
Descriptors: Item Response Theory, Evaluation Methods, Measurement, Goodness of Fit
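As background for the Rasch framework referenced above, the dichotomous Rasch model writes the probability of a correct response as a function only of the difference between person ability and item difficulty, which is the property underlying Rasch's notion of specific objectivity:

P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

where \theta_n is the ability of person n and \delta_i is the difficulty of item i.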
Peer reviewed
Hendrickson, Amy – Educational Measurement: Issues and Practice, 2007
Multistage tests are those in which sets of items are administered adaptively and are scored as a unit. These tests have all of the advantages of adaptive testing, with more efficient and precise measurement across the proficiency scale as well as time savings, without many of the disadvantages of an item-level adaptive test. As a seemingly…
Descriptors: Adaptive Testing, Test Construction, Measurement Techniques, Evaluation Methods
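Hendrickson's review concerns multistage tests, in which examinees complete a routing module and are then assigned to easier or harder second-stage modules based on their performance. The snippet below is a schematic illustration only; the two-stage design, module names, and cut points are invented for this sketch and are not drawn from the article.

# Illustrative two-stage multistage-test router; all cut points are invented.
def route_second_stage(routing_score, max_score=20):
    """Choose a second-stage module from performance on the routing module."""
    proportion = routing_score / max_score
    if proportion < 0.4:
        return "easy_module"
    if proportion < 0.7:
        return "medium_module"
    return "hard_module"

# Example: examinees scoring 5, 12, and 18 of 20 are routed to progressively harder modules.
for score in (5, 12, 18):
    print(score, "->", route_second_stage(score))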
Peer reviewed
Nichols, Paul D.; Meyers, Jason L.; Burling, Kelly S. – Educational Measurement: Issues and Practice, 2009
Assessments labeled as formative have been offered as a means to improve student achievement. But labels can be a powerful way to miscommunicate. For an assessment use to be appropriately labeled "formative," both empirical evidence and reasoned arguments must be offered to support the claim that improvements in student achievement can be linked…
Descriptors: Academic Achievement, Tutoring, Student Evaluation, Evaluation Methods
Peer reviewed
Mislevy, Robert J.; Haertel, Geneva D. – Educational Measurement: Issues and Practice, 2006
Evidence-centered assessment design (ECD) provides language, concepts, and knowledge representations for designing and delivering educational assessments, all organized around the evidentiary argument an assessment is meant to embody. This article describes ECD in terms of layers for analyzing domains, laying out arguments, creating schemas for…
Descriptors: Educational Testing, Test Construction, Evaluation Methods, Computer Simulation
Peer reviewed
Reckase, Mark D. – Educational Measurement: Issues and Practice, 1998
Considers what a responsible test developer would do to gain information to support the consequential basis of validity for a test early in its development. Also examines how the consequential basis of validity would be monitored and reported over the life of the program. The validity of the ACT Assessment is considered as if it…
Descriptors: Evaluation Methods, Program Evaluation, Test Construction, Validity
Peer reviewed
Gorin, Joanna S. – Educational Measurement: Issues and Practice, 2006
One of the primary themes of the National Research Council's 2001 book "Knowing What Students Know" was the importance of cognition as a component of assessment design and measurement theory (NRC, 2001). One reaction to the book has been an increased use of sophisticated statistical methods to model cognitive information available in test data.…
Descriptors: Test Construction, Student Evaluation, Academic Ability, Evaluation Methods
Peer reviewed
Yen, Wendy M. – Educational Measurement: Issues and Practice, 1998
The articles in this issue, written from the perspectives of academics, practitioners, and publishers, show that examining the consequences of assessment is an important, large, and difficult task. Collaborative action by assessment developers, users, and the educational measurement community is needed if progress is to be made. (SLD)
Descriptors: Cooperation, Evaluation Methods, Program Evaluation, Responsibility
Peer reviewed
Moss, Pamela A. – Educational Measurement: Issues and Practice, 1998
Provides an argument for incorporating consideration of consequences into validity theory that is grounded in the reflexive nature of social knowledge. It also calls for the consideration of evidence of validity based on the actual discourse surrounding the practices and products of testing. (SLD)
Descriptors: Evaluation Methods, Evaluation Utilization, Program Evaluation, Test Construction
Peer reviewed
Educational Measurement: Issues and Practice, 2005
A note from the Working Group of the Joint Committee on Testing Practices: The "Code of Fair Testing Practices in Education (Code)" prepared by the Joint Committee on Testing Practices (JCTP) has just been revised for the first time since its initial introduction in 1988. The revision of the Code was inspired primarily by the revision of…
Descriptors: Measurement, Psychological Testing, Test Use, Student Evaluation
Peer reviewed
Nitko, Anthony J. – Educational Measurement: Issues and Practice, 1995
If curriculum is to be the basis for assessment reform, assessment specialists must model the process for producing valid assessment products. Validity criteria should guide any model for the assessment development process. However, curriculum-based assessment systems should not be confused with standards-driven assessment systems. (SLD)
Descriptors: Criteria, Curriculum Based Assessment, Educational Change, Evaluation Methods