Showing 1 to 15 of 437 results
Peer reviewed
Folger, Timothy D.; Bostic, Jonathan; Krupa, Erin E. – Educational Measurement: Issues and Practice, 2023
Validity is a fundamental consideration of test development and test evaluation. The purpose of this study is to define and reify three key aspects of validity and validation, namely test-score interpretation, test-score use, and the claims supporting interpretation and use. This study employed a Delphi methodology to explore how experts in…
Descriptors: Test Interpretation, Scores, Test Use, Test Validity
Peer reviewed
Laura Schildt; Bart Deygers; Albert Weideman – Language Testing, 2024
In the context of policy-driven language testing for citizenship, a growing body of research examines the political justifications and ethical implications of language requirements and test use. However, virtually no studies have looked at the role that language testers play in the evolution of language requirements. Critical gaps remain in our…
Descriptors: Language Tests, Citizenship, Educational Policy, Assessment Literacy
Peer reviewed
Schmitt, Norbert; Nation, Paul; Kremmel, Benjamin – Language Teaching, 2020
Recently, a large number of vocabulary tests have been made available to language teachers, testers, and researchers. Unfortunately, most of them have been launched with inadequate validation evidence. The field of language testing has become increasingly more rigorous in the area of test validation, but developers of vocabulary tests have…
Descriptors: Test Construction, Test Validity, Language Tests, Test Use
Peer reviewed
Ching-Ni Hsieh – ETS Research Report Series, 2023
Research in validity suggests that stakeholders' interpretation and use of test results should be an aspect of validity. Claims about the meaningfulness of test score interpretations and consequences of test use should be backed by evidence that stakeholders understand the definition of the construct assessed and the score report information. The…
Descriptors: Foreign Countries, Language Proficiency, English (Second Language), Language Tests
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Peer reviewed
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Peer reviewed
Matt I. Brown; Patrick R. Heck; Christopher F. Chabris – Journal of Autism and Developmental Disorders, 2024
The Social Shapes Test (SST) is a measure of social intelligence which does not use human faces or rely on extensive verbal ability. The SST has shown promising validity among adults without autism spectrum disorder (ASD), but it is uncertain whether it is suitable for adults with ASD. We find measurement invariance between adults with (n = 229)…
Descriptors: Interpersonal Competence, Autism Spectrum Disorders, Emotional Intelligence, Verbal Ability
Peer reviewed
Corral, Daniel; Carpenter, Shana K.; Perkins, Kyle; Gentile, Douglas A. – Applied Cognitive Psychology, 2020
Online practice quizzes can be used to supplement instruction in the classroom. Such quizzes can engage retrieval practice, thereby improving learning and retention. However, despite their potential benefits, recent work suggests that students typically underutilize online practice quizzes. This article reports an observational classroom study, in…
Descriptors: Student Evaluation, Web Based Instruction, Test Use, Study Habits
Peer reviewed
Tatiana Chaiban; Zeinab Nahle; Ghaith Assi; Michelle Cherfane – Discover Education, 2024
Background: Since it was first launched, ChatGPT, a Large Language Model (LLM), has been widely used across different disciplines, particularly the medical field. Objective: The main aim of this review is to thoroughly assess the performance of the distinct version of ChatGPT in subspecialty written medical proficiency exams and the factors that…
Descriptors: Medical Education, Accuracy, Artificial Intelligence, Computer Software
Peer reviewed
Jieun Kim; Daniel Richard Isbell – Language Assessment Quarterly, 2024
The ACTFL Assessment of Performance Toward Proficiency in Languages (AAPPL, https://www.actfl.n.d.org/assessments/k-12-assessments/aappl) assesses proficiency in 11 languages for students in grades 3 to 12 and is often used to award the Seal of Biliteracy. While arguments for the valid interpretation and uses of the AAPPL have previously been…
Descriptors: Language Tests, Second Language Learning, Second Language Instruction, Language Proficiency
Peer reviewed
R. Lanai Jennings; Megan Midkiff; Emily Nestor McCauley; Jeremy Lopuch; Sandra Stroebel; Rachel James; Mary Toler; Rebecca Wendell; Paula King; Mallory Frampton – Contemporary School Psychology, 2024
Reading comprehension is one of the most valuable academic skills taught in school. Selecting the appropriate assessment instrument to ensure early identification and intervention is important as there is an amalgam of cognitive abilities and academic skills involved in reading comprehension. The GORT-5 is the most recent edition of a test that…
Descriptors: Test Validity, Diagnostic Tests, Reading Comprehension, Early Intervention
Dadey, Nathan; Keng, Leslie; Boyer, Michelle; Marion, Scott – National Center for the Improvement of Educational Assessment, 2021
State summative educational assessment is about to begin in earnest. Rightfully, many are raising questions about the quality, meaning, and appropriate use of the assessment results. This document was written to support state educational agencies (SEAs) and their assessment providers in devising effective and efficient analysis plans. This…
Descriptors: Educational Assessment, Summative Evaluation, Student Evaluation, Test Use
Peer reviewed
Jonathan Schweig; Megan Kuhfeld; Melissa Kay Diliberti; Andrew McEachin; Louis T. Mariano – RAND Corporation, 2022
In this report, RAND researchers investigate one specific issue that may contaminate utilization of COVID-19-era school-aggregate scores and result in faulty comparisons with historical and other proximal aggregate scores: changes in school composition over time. To investigate this issue, they examine data from NWEA's Measures of Academic…
Descriptors: School Demography, COVID-19, Pandemics, Test Use
Jonathan Schweig; Megan Kuhfeld; Melissa Kay Diliberti; Andrew McEachin; Louis T. Mariano – Grantee Submission, 2022
School officials regularly use school-aggregate test scores to monitor school performance and make policy decisions. After the U.S. Department of Education offered assessment waivers to all 50 states in 2019-2020, many educators and policymakers advocated for assessment programs to be restarted in the 2020-2021 school year to evaluate the state of…
Descriptors: School Demography, COVID-19, Pandemics, Test Use
Peer reviewed
Peng, Yue; Yan, Wei; Cheng, Liying – Language Testing, 2021
This test review focuses on the current version (2009) of [Chinese characters omitted] (Hanyu Shuiping Kaoshi), literally translated as the Chinese Language Proficiency Test and abbreviated as HSK. Tailored to non-native speakers of the Chinese language, this test consists of six proficiency levels (Levels 1 and 2 as beginners, Levels 3 and 4 as…
Descriptors: Language Proficiency, Language Tests, Chinese, Decision Making