Showing 1 to 15 of 147 results
Peer reviewed
Direct link
Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024
This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…
Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods
Peer reviewed
Direct link
Fu Chen; Ying Cui; Alina Lutsyk-King; Yizhu Gao; Xiaoxiao Liu; Maria Cutumisu; Jacqueline P. Leighton – Education and Information Technologies, 2024
Post-secondary data literacy education is critical to students' academic and career success. However, the literature has not adequately addressed the conceptualization and assessment of data literacy for post-secondary students. In this study, we introduced a novel digital performance-based assessment for teaching and evaluating post-secondary…
Descriptors: Performance Based Assessment, College Students, Information Literacy, Evaluation Methods
Peer reviewed
PDF on ERIC Download full text
Anani Sarab, Mohammad Reza; Rahmani, Simindokht – International Journal of Language Testing, 2023
Language testing and assessment have grown in popularity and gained significance in the last few decades, and there is a rising need for assessment-literate stakeholders in the field of language education. As teachers play a major role in assessing students, there is a need to ensure they have the right level of assessment knowledge and skills…
Descriptors: Language Tests, Literacy, Second Language Learning, Factor Analysis
Peer reviewed
Direct link
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage tests (MSTs) is difficult, particularly when the test spans several grade levels and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
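For readers unfamiliar with the Angoff family of methods referenced in the standard-setting entries above, the sketch below shows a conventional Angoff-style aggregation in which the cut score is the mean of panelists' summed item ratings. It is illustrative only; the mapping methods evaluated in the study are not detailed in the truncated abstract, and all names here are hypothetical.

```python
# Minimal sketch (not the study's procedure): each panelist rates the
# probability that a minimally competent examinee answers each item
# correctly; the raw-score cut is the mean of panelists' summed ratings.
from statistics import mean

def angoff_cut_score(ratings):
    """ratings: list of per-panelist lists of item probabilities in [0, 1]."""
    panelist_totals = [sum(panelist) for panelist in ratings]
    return mean(panelist_totals)

# Example: three panelists rating a five-item test.
ratings = [
    [0.8, 0.6, 0.9, 0.5, 0.7],
    [0.7, 0.5, 0.8, 0.6, 0.6],
    [0.9, 0.6, 0.9, 0.4, 0.7],
]
print(angoff_cut_score(ratings))  # raw-score cut of 3.4
```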
Peer reviewed
Direct link
Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025
Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet the evaluation of item quality remains a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of large language models, specifically Llama 3-8B, for…
Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation
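The abstract above names Llama 3-8B but does not specify the prompts or quality criteria used. The sketch below is an assumed setup, not the authors' pipeline: a locally hosted instruction-tuned checkpoint is asked to apply a simple keep/discard rubric to each generated item.

```python
# Illustrative sketch only. The model ID and rubric are assumptions; the
# study's actual prompts and evaluation criteria are not given in the abstract.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed checkpoint
)

RUBRIC = (
    "Rate the following multiple-choice item for clarity, a single correct "
    "key, and plausible distractors. Answer with one word: keep or discard.\n\n"
)

def screen_item(item_text: str) -> str:
    prompt = RUBRIC + item_text
    out = generator(prompt, max_new_tokens=5, do_sample=False)
    # The pipeline returns the prompt plus the completion; keep the completion.
    return out[0]["generated_text"][len(prompt):].strip()

# Usage:
# screen_item("Which planet is largest? A) Mars B) Jupiter C) Venus D) Earth")
```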
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and rank them according to which they believe displays the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help maintain standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
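Comparative Judgement results are typically scaled by fitting a pairwise-preference model. The sketch below fits Bradley-Terry strengths by simple iterative scaling; it is a generic illustration under that assumption, not the analysis reported in the paper.

```python
# Minimal Bradley-Terry fit for comparative-judgement data (assumed method).
def bradley_terry(wins, iters=100):
    """wins[i][j] = number of judgements preferring script i over script j."""
    n = len(wins)
    p = [1.0] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            w_i = sum(wins[i])
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n) if j != i
            )
            new_p.append(w_i / denom if denom else p[i])
        total = sum(new_p)
        p = [x * n / total for x in new_p]  # normalise so strengths sum to n
    return p

# Example: script 0 is preferred most often and gets the largest strength.
wins = [[0, 4, 5],
        [1, 0, 3],
        [0, 2, 0]]
print(bradley_terry(wins))
```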
Peer reviewed
Direct link
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Peer reviewed
PDF on ERIC Download full text
Karen Leary Duseau – North American Chapter of the International Group for the Psychology of Mathematics Education, 2023
Assessment is a topic of concern to all stakeholders in our educational system. Pattern Based Questions are an assessment tool that offers an alternative to standardized assessments. They are grounded in generative learning pedagogy, which shows promise for engaging all learners and supporting teaching and learning, but their validity has not yet…
Descriptors: Undergraduate Students, College Mathematics, Mathematics Skills, Thinking Skills
Peer reviewed
Direct link
Philomina Abena Anyidoho; Rebecca Berenbon; Bridget McHugh – International Journal of Training and Development, 2024
Many workforce development training programmes use learning gains as a measure of programme effectiveness. However, research on K-12 education suggests that posttest scores may be influenced by pretesting effects. Pretesting may improve posttest performance by giving learners preknowledge of posttest content. Alternatively, pretesting may enhance…
Descriptors: Trainees, Trainers, Labor Force Development, High Stakes Tests
Peer reviewed
Direct link
Meng, Yaru; Fu, Hua – Modern Language Journal, 2023
The distinguishing feature of dynamic assessment (DA) is the dialectical integration of assessment and instruction. However, how to design the targeted instruction or mediation has been relatively underexplored. To address this gap, this study proposes the attribute-based mediation model (AMM), an English-as-a-foreign-language listening mediation…
Descriptors: Evaluation Methods, Teaching Methods, Models, English (Second Language)
Peer reviewed
PDF on ERIC Download full text
Toker, Turker – International Journal of Curriculum and Instruction, 2023
Achievement tests are among the most widely used data collection tools for measuring individuals' knowledge and skill levels. For this reason, valid and reliable achievement tests that accurately reveal the competencies a person should have in a given discipline are of great importance. The purpose of this research is to…
Descriptors: Basic Skills, Evaluation Methods, Test Items, Test Validity
Peer reviewed
Direct link
Thomas Bickerton, Robert; Sangwin, Chris J. – International Journal of Mathematical Education in Science and Technology, 2022
We discuss a practical method for assessing mathematical proof online. We examine the use of faded worked examples and reading comprehension questions to understand proof. By breaking down a given proof, we formulate a checklist that can be used to generate comprehension questions which can be assessed automatically online. We then provide some…
Descriptors: Mathematics Instruction, Validity, Mathematical Logic, Evaluation Methods
Parry, James R. – Online Submission, 2020
This paper presents research and a method to ensure that parallel assessments generated from a large test-item database maintain equitable difficulty and content coverage each time the assessment is presented. To maintain fairness and validity, it is important that all instances of an assessment that is intended to test the…
Descriptors: Culture Fair Tests, Difficulty Level, Test Items, Test Validity
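The abstract above describes keeping parallel forms comparable in difficulty and content coverage, but the truncated text does not give the author's procedure. The sketch below is a hypothetical baseline: sample a fixed number of items per content area and accept the form only if its mean difficulty lands near a blueprint target.

```python
# Hypothetical sketch, not the paper's method: stratified sampling from an
# item bank with an acceptance check on mean difficulty.
import random
from statistics import mean

def assemble_form(bank, per_area, target_difficulty, tol=0.05, attempts=1000):
    """bank: list of dicts with 'area' and 'difficulty' (proportion correct);
       per_area: dict mapping content area to number of items required."""
    by_area = {}
    for item in bank:
        by_area.setdefault(item["area"], []).append(item)
    for _ in range(attempts):
        form = []
        for area, k in per_area.items():
            form.extend(random.sample(by_area[area], k))
        if abs(mean(i["difficulty"] for i in form) - target_difficulty) <= tol:
            return form
    raise ValueError("No form met the difficulty target; relax the tolerance.")
```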
Thapelo Ncube Whitfield – ProQuest LLC, 2021
Student Experience surveys are used to measure student attitudes towards their campus as well as to initiate conversations for institutional change. Validity evidence to support the interpretations of these surveys' results, however, is lacking. The first purpose of this study was to compare three Differential Item Functioning (DIF) methods on…
Descriptors: College Students, Student Surveys, Student Experience, Student Attitudes
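The three DIF methods compared in the study are not named in the truncated abstract. As a generic illustration of one widely used procedure, the sketch below computes the Mantel-Haenszel common odds ratio across total-score strata and reports it on the ETS delta metric.

```python
# Mantel-Haenszel DIF sketch (illustration only, not the study's analysis).
# Examinees are stratified by total score; each stratum contributes a 2x2
# table of group (reference/focal) by item response (right/wrong).
from math import log

def mantel_haenszel_dif(strata):
    """strata: list of dicts with counts for one item:
       ref_right, ref_wrong, foc_right, foc_wrong."""
    num = den = 0.0
    for s in strata:
        total = s["ref_right"] + s["ref_wrong"] + s["foc_right"] + s["foc_wrong"]
        num += s["ref_right"] * s["foc_wrong"] / total
        den += s["ref_wrong"] * s["foc_right"] / total
    alpha = num / den
    # ETS delta metric; roughly, |value| >= 1.5 is flagged as large DIF.
    return -2.35 * log(alpha)

# Example with two score strata:
strata = [
    {"ref_right": 40, "ref_wrong": 10, "foc_right": 35, "foc_wrong": 15},
    {"ref_right": 30, "ref_wrong": 20, "foc_right": 25, "foc_wrong": 25},
]
print(mantel_haenszel_dif(strata))
```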
Peer reviewed
PDF on ERIC Download full text
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring