ERIC - Search Results

Showing 1 to 15 of 39 results

Evolving Educational Testing to Meet Students' Needs: Design-in-Real-Time Assessment

Peer reviewed

Stephen G. Sireci; Javier Suárez-Álvarez; April L. Zenisky; Maria Elena Oliveri – Educational Measurement: Issues and Practice, 2024

The goal in personalized assessment is to best fit the needs of each individual test taker, given the assessment purposes. Design-in-Real-Time (DIRTy) assessment reflects the progressive evolution in testing from a single test, to an adaptive test, to an adaptive assessment "system." In this article, we lay the foundation for DIRTy…

Descriptors: Educational Assessment, Student Needs, Test Format, Test Construction

The Good Side of COVID-19

Peer reviewed

Bennett, Randy E. – Educational Measurement: Issues and Practice, 2022

This commentary focuses on one of the positive impacts of COVID-19, which was to tie societal inequity to testing in a manner that could motivate the reimagining of our field. That reimagining needs to account for our nation's dramatically changing demographics so that assessment generally, and standardized testing specifically, better fit the…

Descriptors: COVID-19, Pandemics, Social Justice, Testing

Applying a Mixture Rasch Model-Based Approach to Standard Setting

Peer reviewed

Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023

The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…

Descriptors: Item Response Theory, Standard Setting, Testing, Sampling

Setting and Validating Multiple Standards on a Multistage-Adaptive Test

Peer reviewed

Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022

Setting cut scores on multistage-adaptive tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…

Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis

Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study

Peer reviewed

Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025

Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…

Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation

Considerations for Future Online Testing and Assessment in Colleges and Universities

Peer reviewed

Middleton, Kyndra V. – Educational Measurement: Issues and Practice, 2022

The onset of the coronavirus pandemic forced schools and universities across the nation and world to close and move to distance learning rather immediately. Almost two years later, colleges and universities have reopened, and most students have returned to campuses, but distance learning still occurs at a much higher rate than before the beginning…

Descriptors: Computer Assisted Testing, Internet, Student Evaluation, College Students

Reframing Research and Assessment Practices: Advancing an Antiracist and Anti-Ableist Research Agenda

Peer reviewed

Angela Johnson; Elizabeth Barker; Marcos Viveros Cespedes – Educational Measurement: Issues and Practice, 2024

Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy…

Descriptors: Equal Education, Inclusion, Evaluation Methods, Error of Measurement

It's Not Just Angoff: Misperceptions of Hard and Easy Items in Bookmark-Type Ratings

Peer reviewed

Wyse, Adam E.; Babcock, Ben – Educational Measurement: Issues and Practice, 2020

A common belief is that the Bookmark method is a cognitively simpler standard-setting method than the modified Angoff method. However, a limited amount of research has investigated panelists' ability to perform the Bookmark method well, and whether some of the challenges panelists face with the Angoff method may also be present in the Bookmark…

Descriptors: Standard Setting (Scoring), Evaluation Methods, Testing Problems, Test Items

Digital Module 07: Subscores--Evaluation and Reporting https://ncme.elevate.commpartners.com

Peer reviewed

Sinharay, Sandip – Educational Measurement: Issues and Practice, 2019

Test score users often demand the reporting of subscores due to their potential diagnostic, remedial, and instructional benefits. Therefore, there is substantial pressure on testing programs to report subscores. However, professional standards require that subscores have to satisfy minimum quality standards before they can be reported. In this…

Descriptors: Testing, Scores, Item Response Theory, Evaluation Methods

Assessment for Learning with Diverse Learners in a Digital World

Peer reviewed

DiCerbo, Kristen – Educational Measurement: Issues and Practice, 2020

We have the ability to capture data from students' interactions with digital environments as they engage in learning activity. This provides the potential for a reimagining of assessment to one in which assessment becomes part of our natural education activity and can be used to support learning. These new data allow us to more closely examine the…

Descriptors: Student Diversity, Information Technology, Learning Activities, Learning Processes

Evaluating Content Alignment in Computerized Adaptive Testing

Peer reviewed

Wise, Steven L.; Kingsbury, G. Gage; Webb, Norman L. – Educational Measurement: Issues and Practice, 2015

The alignment between a test and the content domain it measures represents key evidence for the validation of test score inferences. Although procedures have been developed for evaluating the content alignment of linear tests, these procedures are not readily applicable to computerized adaptive tests (CATs), which require large item pools and do…

Descriptors: Computer Assisted Testing, Adaptive Testing, Alignment (Education), Test Content

Digital Module 10: Rasch Measurement Theory

Peer reviewed

Wang, Jue; Engelhard, George, Jr. – Educational Measurement: Issues and Practice, 2019

In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales. From a theoretical perspective, they discuss the historical and philosophical perspectives on measurement with a focus on Rasch's concept of specific objectivity and…

Descriptors: Item Response Theory, Evaluation Methods, Measurement, Goodness of Fit

Measurement, Sampling, and Equating Errors in Large-Scale Assessments

Peer reviewed

Wu, Margaret – Educational Measurement: Issues and Practice, 2010

In large-scale assessments, such as state-wide testing programs, national sample-based assessments, and international comparative studies, there are many steps involved in the measurement and reporting of student achievement. There are always sources of inaccuracies in each of the steps. It is of interest to identify the source and magnitude of…

Descriptors: Testing Programs, Educational Assessment, Measures (Individuals), Program Effectiveness

An NCME Instructional Module on Using Differential Step Functioning to Refine the Analysis of DIF in Polytomous Items

Peer reviewed

Penfield, Randall D.; Gattamorta, Karina; Childs, Ruth A. – Educational Measurement: Issues and Practice, 2009

Traditional methods for examining differential item functioning (DIF) in polytomously scored test items yield a single item-level index of DIF and thus provide no information concerning which score levels are implicated in the DIF effect. To address this limitation of DIF methodology, the framework of differential step functioning (DSF) has…

Descriptors: Test Bias, Test Items, Evaluation Methods, Scores

Introduction of External, Independent Testing in "New Countries": Successes and Defeats of the Introduction of Modern Educational Assessment Techniques in Former Soviet and Socialist Countries

Peer reviewed

Bakker, Steven – Educational Measurement: Issues and Practice, 2012

A particular trait of the educational system under socialist reign was accountability at the input side--appropriate facilities, centrally decided curriculum, approved text-books, and uniformly trained teachers--but no control on the output. It was simply assumed that it met the agreed standards, which was, in turn, proven by the statistics…

Descriptors: Accountability, Social Problems, Ethics, Foreign Students
