ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	11
Since 2016 (last 10 years)	26
Since 2006 (last 20 years)	66

Descriptor

Evaluation Methods	104
Test Items	104
Test Validity	104
Test Construction	42
Test Reliability	36
Student Evaluation	25
Psychometrics	21
Scores	17
Item Analysis	16
Foreign Countries	15
Test Bias	15
Difficulty Level	14
Elementary Secondary Education	12
Measurement Techniques	11
Achievement Tests	10
Standardized Tests	10
Educational Assessment	9
Item Response Theory	9
Multiple Choice Tests	9
College Students	8
Computer Assisted Testing	8
Teaching Methods	8
Test Results	8
Test Use	8
Testing	8

Publication Type

Journal Articles	62
Reports - Research	47
Reports - Evaluative	24
Reports - Descriptive	14
Speeches/Meeting Papers	10
Guides - Non-Classroom	6
Opinion Papers	6
Tests/Questionnaires	6
Guides - General	2
Information Analyses	2
Books	1
Dissertations/Theses -…	1
Dissertations/Theses -…	1
ERIC Digests in Full Text	1
ERIC Publications	1
Guides - Classroom - Learner	1
Guides - Classroom - Teacher	1
Reports - General	1

Education Level

Higher Education	12
Elementary Secondary Education	11
Postsecondary Education	9
Elementary Education	7
Secondary Education	6
High Schools	5
Grade 8	4
Middle Schools	4
Early Childhood Education	3
Grade 10	3
Grade 5	3
Junior High Schools	3
Grade 6	2
Grade 7	2
Grade 9	2
Preschool Education	2
Adult Education	1
Grade 11	1
Grade 12	1
Intermediate Grades	1

Audience

Teachers	6
Practitioners	5
Administrators	3
Support Staff	3
Researchers	2
Students	2
Community	1
Parents	1

Location

California	2
Germany	2
Turkey	2
Australia	1
Colorado	1
Dominica	1
Egypt	1
Grenada	1
India	1
Iran	1
Italy	1
Massachusetts	1
Mississippi	1
North Carolina	1
Philippines	1
Saint Lucia	1
Saint Vincent and the…	1
South Korea	1
United Kingdom	1

Laws, Policies, & Programs

Every Student Succeeds Act…	3
Individuals with Disabilities…	3
Rehabilitation Act 1973…	3
No Child Left Behind Act 2001	2

Assessments and Surveys

Dynamic Indicators of Basic…	2
National Assessment of…	2
Hidden Figures Test	1
International English…	1
Maslach Burnout Inventory	1
Massachusetts Comprehensive…	1
Pennsylvania Educational…	1
SAT (College Admission Test)	1
Stanford Achievement Tests	1
Test of English as a Foreign…	1

Showing 1 to 15 of 104 results

Evaluating Methodological Enhancements to the Yes/No Angoff Standard-Setting Method in Language Proficiency Assessment

Peer reviewed

Tia M. Fechter; Heeyeon Yoon – Language Testing, 2024

This study evaluated the efficacy of two proposed methods in an operational standard-setting study conducted for a high-stakes language proficiency test of the U.S. government. The goal was to seek low-cost modifications to the existing Yes/No Angoff method to increase the validity and reliability of the recommended cut scores using a convergent…

Descriptors: Standard Setting, Language Proficiency, Language Tests, Evaluation Methods

Validating a Novel Digital Performance-Based Assessment of Data Literacy: Psychometric and Eye-Tracking Analyses

Peer reviewed

Fu Chen; Ying Cui; Alina Lutsyk-King; Yizhu Gao; Xiaoxiao Liu; Maria Cutumisu; Jacqueline P. Leighton – Education and Information Technologies, 2024

Post-secondary data literacy education is critical to students' academic and career success. However, the literature has not adequately addressed the conceptualization and assessment of data literacy for post-secondary students. In this study, we introduced a novel digital performance-based assessment for teaching and evaluating post-secondary…

Descriptors: Performance Based Assessment, College Students, Information Literacy, Evaluation Methods

Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study

Peer reviewed

Guher Gorgun; Okan Bulut – Educational Measurement: Issues and Practice, 2025

Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for…

Descriptors: Artificial Intelligence, Quality Control, Technology Uses in Education, Automation

Disrupted Data: Using Longitudinal Assessment Systems to Monitor Test Score Quality

Peer reviewed

An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022

Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…

Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies

Examining the Influence of Item Exposure and Retrieval Practice Effects on Test Performance in a Large-Scale Workforce Development Training Programme

Peer reviewed

Philomina Abena Anyidoho; Rebecca Berenbon; Bridget McHugh – International Journal of Training and Development, 2024

Many workforce development training programmes use learning gains as a measure of programme effectiveness. However, research on K-12 education suggests that posttest scores may be influenced by pretesting effects. Pretesting may improve posttest performance by giving learners preknowledge of posttest content. Alternatively, pretesting may enhance…

Descriptors: Trainees, Trainers, Labor Force Development, High Stakes Tests

Modeling Mediation in the Dynamic Assessment of Listening Ability from the Cognitive Diagnostic Perspective

Peer reviewed

Meng, Yaru; Fu, Hua – Modern Language Journal, 2023

The distinguishing feature of dynamic assessment (DA) is the dialectical integration of assessment and instruction. However, how to design the targeted instruction or mediation has been relatively underexplored. To address this gap, this study proposes the attribute-based mediation model (AMM), an English-as-a-foreign-language listening mediation…

Descriptors: Evaluation Methods, Teaching Methods, Models, English (Second Language)

Assessment of Basic Competencies in Adults: Item Pool Validity and Reliability Study

Peer reviewed

Toker, Turker – International Journal of Curriculum and Instruction, 2023

Achievement tests are among the most widely used data collection tools to measure the knowledge and skill levels of individuals. For this reason, the existence of valid and reliable achievement tests that can perfectly reveal the competencies that a person should have in any discipline is of great importance. The purpose of this research is to…

Descriptors: Basic Skills, Evaluation Methods, Test Items, Test Validity

Ensuring Fairness in Difficulty and Content among Parallel Assessments Generated from a Test-Item Database

Parry, James R. – Online Submission, 2020

This paper presents research and provides a method to ensure that parallel assessments, that are generated from a large test-item database, maintain equitable difficulty and content coverage each time the assessment is presented. To maintain fairness and validity it is important that all instances of an assessment, that is intended to test the…

Descriptors: Culture Fair Tests, Difficulty Level, Test Items, Test Validity

Comparison of DIF Methods for the Student Experience in the Research University Survey: A Validity and Methodological Study

Thapelo Ncube Whitfield – ProQuest LLC, 2021

Student Experience surveys are used to measure student attitudes towards their campus as well as to initiate conversations for institutional change. Validity evidence to support the interpretations of these surveys' results, however, is lacking. The first purpose of this study was to compare three Differential Item Functioning (DIF) methods on…

Descriptors: College Students, Student Surveys, Student Experience, Student Attitudes

Adapting Paper-Based Tests for Computer Administration: Lessons Learned from 30 Years of Mode Effects Studies in Education

Peer reviewed

Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022

In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…

Descriptors: Computer Assisted Testing, Tests, Scores, Scoring

Development and Validation of a Scenario-Based Teacher Language Assessment Literacy Test

Peer reviewed

Anani Sarab, Mohammad Reza; Rahmani, Simindokht – International Journal of Language Testing, 2023

Language testing and assessment have grown in popularity and gained significance in the last few decades, and there is a rising need for assessment literate stakeholders in the field of language education. As teachers play a major role in assessing students, there is a need to make sure they have the right level of assessment knowledge and skills…

Descriptors: Language Tests, Literacy, Second Language Learning, Factor Analysis

A Cognitive Diagnostic Assessment Study of the Reading Comprehension Section of the Preliminary English Test (PET)

Peer reviewed

Mohammed, Aisha; Dawood, Abdul Kareem Shareef; Alghazali, Tawfeeq; Kadhim, Qasim Khlaif; Sabti, Ahmed Abdulateef; Sabit, Shaker Holh – International Journal of Language Testing, 2023

Cognitive diagnostic models (CDMs) have received much interest within the field of language testing over the last decade due to their great potential to provide diagnostic feedback to all stakeholders and ultimately improve language teaching and learning. A large number of studies have demonstrated the application of CDMs on advanced large-scale…

Descriptors: Reading Comprehension, Reading Tests, Language Tests, English (Second Language)

An Experimental Study of the Internal Consistency of Judgments Made in Bookmark Standard Setting

Peer reviewed

Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia – Journal of Educational Measurement, 2017

Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…

Descriptors: Evaluation Methods, Standard Setting, Cutting Scores, Expertise

The Access to Literacy Assessment System for Phonological Awareness: An Adaptive Measure of Phonological Awareness Appropriate for Children with Speech and/or Language Impairment

Peer reviewed

Skibbe, Lori E.; Bowles, Ryan P.; Goodwin, Sarah; Troia, Gary A.; Konishi, Haruka – Language, Speech, and Hearing Services in Schools, 2020

Purpose: The Access to Literacy Assessment System--Phonological Awareness (ATLAS-PA) was developed for use with children with speech and/or language impairment. The subtests (Rhyming, Blending, and Segmenting) are appropriate for children who are 3-7 years of age. ATLAS-PA is composed entirely of receptive items, incorporates individualized levels…

Descriptors: Phonological Awareness, Speech Impairments, Language Impairments, Young Children

The Access to Literacy Assessment System for Phonological Awareness: An Adaptive Measure of Phonological Awareness Appropriate for Children with Speech and/or Language Impairment

Peer reviewed

Skibbe, Lori E.; Bowles, Ryan P.; Goodwin, Sarah; Troia, Gary A.; Konishi, Haruka – Grantee Submission, 2020

Purpose: The Access to Literacy Assessment System--Phonological Awareness (ATLAS-PA) was developed for use with children with speech and/or language impairment. The subtests (rhyming, blending, segmenting) are appropriate for children who are 3 to 7 years of age. ATLAS-PA is comprised entirely of receptive items, incorporates individualized levels…

Descriptors: Phonological Awareness, Speech Impairments, Language Impairments, Young Children

Source

Educational and Psychological…	5
Online Submission	4
Educational Measurement:…	3
Measurement:…	3
Smarter Balanced Assessment…	3
Grantee Submission	2
International Journal of…	2
Journal of Chemical Education	2
Journal of Educational…	2
Practical Assessment,…	2
Psychology Teaching Review	2
Alberta Journal of…	1
Applied Psychological…	1
Assessment	1
Center for Assessment and…	1
Clearing House	1
Computer Science Education	1
Early Education and…	1
Education and Information…	1
Educational Assessment	1
Educational Assessment,…	1
Educational Research and…	1
Educational Research and…	1
English Teaching Forum	1
European Journal of Physics…	1

Author

Hill, Heather C.	3
Abedi, Jamal	2
Blunk, Merrie	2
Bowles, Ryan P.	2
Goffney, Imani Masters	2
Goodwin, Sarah	2
Hambleton, Ronald K.	2
Konishi, Haruka	2
Skibbe, Lori E.	2
Troia, Gary A.	2
Ahmed, Wondimu	1
Akarsu, Bayram	1
Alexander, Patricia A.	1
Alghazali, Tawfeeq	1
Alina Lutsyk-King	1
Altman, Jason	1
An, Lily Shiao	1
Anani Sarab, Mohammad Reza	1
Baker, Eva L.	1
Baldwin, Peter	1
Ball, Deborah Loewenberg	1
Barnes, Laura L. B.	1
Bejar, Isaac I.	1
Bernholt, Sascha	1