Showing 1 to 15 of 674 results
Peer reviewed
Michael D. Wray; Matthew R. Reynolds – Journal of Psychoeducational Assessment, 2025
The KeyMath-3 Diagnostic Assessment (KM-3) is an individually-administered math assessment used in educational placement and diagnostic decisions. It includes 10 subtests making up Basic Concepts, Operations, and Applications indexes and a "Total Test" composite that measures overall math ability. Here, covariances among subtests from…
Descriptors: Diagnostic Tests, Mathematics Tests, Arithmetic, Factor Analysis
Peer reviewed
Aloisi, Cesare – European Journal of Education, 2023
This article considers the challenges of using artificial intelligence (AI) and machine learning (ML) to assist high-stakes standardised assessment. It focuses on the detrimental effect that even state-of-the-art AI and ML systems could have on the validity of national exams of secondary education, and how lower validity would negatively affect…
Descriptors: Standardized Tests, Test Validity, Credibility, Algorithms
Peer reviewed
Marcos Jiménez; María Zapata-Cáceres; Marcos Román-González; Gregorio Robles; Jesús Moreno-León; Estefanía Martín-Barroso – Journal of Science Education and Technology, 2024
Computational thinking (CT) is a multidimensional term that encompasses a wide variety of problem-solving skills related to the field of computer science. Unfortunately, standardized, valid, and reliable methods to assess CT skills in preschool children are lacking, compromising the reliability of the results reported in CT interventions. To…
Descriptors: Computation, Thinking Skills, Student Evaluation, Preschool Children
Peer reviewed
Huawei, Shi; Aryadoust, Vahid – Education and Information Technologies, 2023
Automated writing evaluation (AWE) systems are developed based on interdisciplinary research and technological advances such as natural language processing, computer sciences, and latent semantic analysis. Despite a steady increase in research publications in this area, the results of AWE investigations are often mixed, and their validity may be…
Descriptors: Writing Evaluation, Writing Tests, Computer Assisted Testing, Automation
Peer reviewed
Tiffany Wu; Christina Weiland; Meghan McCormick; JoAnn Hsueh; Catherine Snow; Jason Sachs – Grantee Submission, 2024
The Hearts and Flowers (H&F) task is a computerized executive functioning (EF) assessment that has been used to measure EF from early childhood to adulthood. It provides data on accuracy and reaction time (RT) across three different task blocks (hearts, flowers, and mixed). However, there is a lack of consensus in the field on how to score the…
Descriptors: Scoring, Executive Function, Kindergarten, Young Children
Peer reviewed
Susan K. Johnsen – Gifted Child Today, 2024
The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed, reliability and validity studies as well as questions related to types of scores, administration, and…
Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity
Peer reviewed
Han, Chao – Language Testing, 2022
Over the past decade, testing and assessing spoken-language interpreting has garnered an increasing amount of attention from stakeholders in interpreter education, professional certification, and interpreting research. This is because in these fields assessment results provide a critical evidential basis for high-stakes decisions, such as the…
Descriptors: Translation, Language Tests, Testing, Evaluation Methods
Peer reviewed
Reuben S. Asempapa; Doris Lee – Discover Education, 2025
Across the world, standards and practices for preparing teachers of mathematics emphasize the importance of math modeling (MM) in developing students' mathematical thinking. The aim of this research study was to develop the Mathematical Modeling Knowledge Scale (MAMKS), capable of determining preservice teachers' (PSTs') knowledge of MM. The study…
Descriptors: Preservice Teachers, Preservice Teacher Education, Mathematics Education, Mathematics Curriculum
Peer reviewed
Rafner, Janet; Biskjaer, Michael Mose; Zana, Blanka; Langsford, Steven; Bergenholtz, Carsten; Rahimi, Seyedahmad; Carugati, Andrea; Noy, Lior; Sherson, Jacob – Creativity Research Journal, 2022
Creativity assessments should be valid, reliable, and scalable to support various stakeholders (e.g., policy-makers, educators, corporations, and the general public) in their decision-making processes. Established initiatives toward scalable creativity assessments have relied on well-studied standardized tests. Although robust in many ways, most…
Descriptors: Creativity, Evaluation Methods, Video Games, Computer Assisted Testing
Peer reviewed
PDF on ERIC Download full text
Alatli, Betül – International Journal of Curriculum and Instruction, 2022
This study was conducted to review the use of tests. For this purpose, 45 articles in which the Turkish form of the "Test Anxiety Inventory (TAI)," which is one of the tests frequently used in the field of education, was employed and that were published between 2000 and 2020 were examined in terms of factors that should be considered in…
Descriptors: Anxiety, Likert Scales, Test Anxiety, Test Reliability
Peer reviewed
Khodi, Ali – Language Testing in Asia, 2021
The present study attempted to investigate factors which affect EFL writing scores using generalizability theory (G-theory). To this purpose, one hundred and twenty students participated in one independent and one integrated writing task. Subsequently, their performances were scored by six raters: one self-rating, three peer-rating and…
Descriptors: Writing Tests, Scores, Generalizability Theory, English (Second Language)
Peer reviewed
PDF on ERIC Download full text
Beheshti, Shima; Safa, Mohammad Ahmadi – Iranian Journal of Language Teaching Research, 2023
The indefinite nature of test fairness and different interpretations and definitions of the concept have stirred a lot of controversy over the years, necessitating the reconceptualization of the concept. On this basis, this study aimed to explore the empirical validity of Kunnan's (2008) Test Fairness Framework (TFF) and revisit the established…
Descriptors: Test Bias, Equal Education, Grounded Theory, Test Construction
Peer reviewed
Lyrica Lucas; Anum Khushal; Robert Mayes; Brian A. Couch; Joseph Dauer – International Journal of Science Education, 2025
Educational reform priorities such as emphasis on quantitative modelling (QM) have positioned undergraduate biology instructors as designers of QM experiences to engage students in authentic science practices that support the development of data-driven and evidence-based reasoning. Yet, little is known about how biology instructors adapt to the…
Descriptors: Undergraduate Students, College Science, Biology, Classroom Observation Techniques
Peer reviewed
Fergadiotis, Gerasimos; Casilio, Marianne; Dickey, Michael Walsh; Steel, Stacey; Nicholson, Hannele; Fleegle, Mikala; Swiderski, Alexander; Hula, William D. – Journal of Speech, Language, and Hearing Research, 2023
Purpose: Item response theory (IRT) is a modern psychometric framework with several advantageous properties as compared with classical test theory. IRT has been successfully used to model performance on anomia tests in individuals with aphasia; however, all efforts to date have focused on noun production accuracy. The purpose of this study is to…
Descriptors: Item Response Theory, Psychometrics, Verbs, Naming
Peer reviewed
Nakamura, Keita – Language Testing in Asia, 2022
Background: This study investigated the scoring and criterion-related validity of the TEAP, a newly developed Test of English for Academic Purposes. In this study, scoring validity was examined by investigating the factor structure, while criterion-related validity was examined by first investigating the longitudinal change of test takers'…
Descriptors: Test Validity, English for Academic Purposes, Language Tests, Scoring