Showing 1 to 15 of 892 results
Peer reviewed
Laura Schildt; Bart Deygers; Albert Weideman – Language Testing, 2024
In the context of policy-driven language testing for citizenship, a growing body of research examines the political justifications and ethical implications of language requirements and test use. However, virtually no studies have looked at the role that language testers play in the evolution of language requirements. Critical gaps remain in our…
Descriptors: Language Tests, Citizenship, Educational Policy, Assessment Literacy
Peer reviewed
Daniel Koretz – Journal of Educational and Behavioral Statistics, 2024
A critically important balance in educational measurement between practical concerns and matters of technique has atrophied in recent decades, and as a result, some important issues in the field have not been adequately addressed. I start with the work of E. F. Lindquist, who exemplified the balance that is now wanting. Lindquist was arguably the…
Descriptors: Educational Assessment, Evaluation Methods, Achievement Tests, Educational History
Peer reviewed
Andrew P. Jaciw – American Journal of Evaluation, 2025
By design, randomized experiments (XPs) rule out bias from confounded selection of participants into conditions. Quasi-experiments (QEs) are often considered second-best because they do not share this benefit. However, when results from XPs are used to generalize causal impacts, the benefit from unconfounded selection into conditions may be offset…
Descriptors: Elementary School Students, Elementary School Teachers, Generalization, Test Bias
Peer reviewed
Schmitt, Norbert; Nation, Paul; Kremmel, Benjamin – Language Teaching, 2020
Recently, a large number of vocabulary tests have been made available to language teachers, testers, and researchers. Unfortunately, most of them have been launched with inadequate validation evidence. The field of language testing has become increasingly more rigorous in the area of test validation, but developers of vocabulary tests have…
Descriptors: Test Construction, Test Validity, Language Tests, Test Use
Peer reviewed
Manxia Dong; Boyu Wang – Language Testing in Asia, 2025
This study aimed to explore the relationship between students' understanding of the National Matriculation English Test (NMET) and their learning practices through standard multiple regression (SMR) and structural equation modeling (SEM) with the purpose of unraveling the working mechanism of washback. A total number of 3105 Chinese senior high…
Descriptors: Foreign Countries, High School Seniors, Test Construction, Test Use
Peer reviewed
Shun-Fu Hu; Amery D. Wu; Jake Stone – Journal of Educational Measurement, 2025
Scoring high-dimensional assessments (e.g., > 15 traits) can be a challenging task. This paper introduces the multilabel neural network (MNN) as a scoring method for high-dimensional assessments. Additionally, it demonstrates how MNN can score the same test responses to maximize different performance metrics, such as accuracy, recall, or…
Descriptors: Tests, Testing, Scores, Test Construction
Peer reviewed
Ellis, Sue; Rowe, Adele – Support for Learning, 2020
This paper describes the development and use of a tool designed to support educators to use a broad range of professional knowledge to enable inclusive literacy teaching that delivers social justice and narrows the attainment gap associated with poverty. The tool encourages teachers to formally recognise and act on a wide range of evidence about…
Descriptors: Literacy, Social Justice, Inclusion, Achievement Gap
Peer reviewed
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Peer reviewed
Tatiana Chaiban; Zeinab Nahle; Ghaith Assi; Michelle Cherfane – Discover Education, 2024
Background: Since it was first launched, ChatGPT, a Large Language Model (LLM), has been widely used across different disciplines, particularly the medical field. Objective: The main aim of this review is to thoroughly assess the performance of the distinct version of ChatGPT in subspecialty written medical proficiency exams and the factors that…
Descriptors: Medical Education, Accuracy, Artificial Intelligence, Computer Software
Peer reviewed
Lovisa Alehagen; Sven Bölte; Melissa H Black – Autism: The International Journal of Research and Practice, 2025
The International Classification of Functioning, Disability, and Health is a biopsychosocial framework of health-related functioning designed to provide a unifying system for health care, social services, education, and policy sectors. Since its publication in 2001, the International Classification of Functioning has been used to guide clinical…
Descriptors: Autism Spectrum Disorders, Attention Deficit Hyperactivity Disorder, Classification, Functional Behavioral Assessment
Peer reviewed
Ying Wu; Rita Elaine Silver; Guangwei Hu – Journal of Multilingual and Multicultural Development, 2024
The Zhuang language test ("Vahcuengh Sawcuengh Suijbingz Gaujsi", VSSG) is the first minority language test in the People's Republic of China. It was designed with multiple goals including improving Zhuang language teaching, recruiting students for relevant majors of tertiary study, identifying proficiency for work-related applications,…
Descriptors: Language Minorities, Language Tests, Second Language Learning, Second Language Instruction
Peer reviewed
PDF on ERIC
Karatas, Tuçe Öztürk – Education Quarterly Reviews, 2021
In the 21st century, with the rise of the popularity of standardized or large-scale tests, their high-stakes have started to be apparent. High-stake tests are not new, but in most cases, their current use as social practice tends to shape individuals' futures. Currently the new trend for their quality discussion aims to critically evaluate tests…
Descriptors: Language Tests, Standardized Tests, High Stakes Tests, Test Use
Im, Gwan-Hyeok; Shin, Dongil; Park, Soohyeon – Current Issues in Language Planning, 2022
This study suggests a conceptual framework for policy-driven test development and validation, using the Test of Proficiency in Korean (TOPIK) as an example context. By linking the literature on policy analysis and argument structure in the validation of testing, the strong relationships between policy and testing are illustrated. This rationalizes…
Descriptors: Language Proficiency, Language Tests, Korean, Test Construction
Peer reviewed
Muhammad Yoga Prabowo; Sarah Rahmadian – TEFLIN Journal: A publication on the teaching and learning of English, 2023
The outbreak of the COVID-19 pandemic has transformed the educational landscape in a way unseen before. Educational institutions are navigating between offline and online learning worldwide. Computer-based testing is rapidly taking over paper-and-pencil testing as the dominant mode of assessment. In some settings, computer-based and…
Descriptors: English (Second Language), Second Language Learning, Test Format, Language Tests
Peer reviewed
Flake, Jessica Kay – Educational Psychologist, 2021
An increased focus on transparency and replication in science has stimulated reform in research practices and dissemination. As a result, the research culture is changing: the use of preregistration is on the rise, access to data and materials is increasing, and large-scale replication studies are more common. In this article, I discuss two…
Descriptors: Educational Psychology, Construct Validity, Access to Information, Test Construction